Abstract. Verification of properties of first order logic with two variables
FO² has been investigated in a number of contexts. Over arbitrary
structures it is known to be decidable with NEXPTIME complexity, with
finitely satisfiable formulas having exponential-sized models. Over word
structures, where FO² is known to have the same expressiveness as unary
temporal logic, the same properties hold. Over finite labelled ordered
trees FO² is also of interest: it is known to have the same expressiveness
as navigational XPath, a common query language for XML documents.
Prior work on XPath and FO² gives a 2EXPTIME bound for satisfiability
of FO². In this work we give the first in-depth look at the complexity of
FO² on trees, and on the size and depth of models. We show that the
doubly-exponential bound is not tight, and neither do the
NEXPTIME-completeness results from the word case carry over: the exact
complexity varies depending on the vocabulary used, the presence or absence
of a schema, and the encoding used for labels. Our results depend on an
analysis of subformula types in models of FO² formulas, including
techniques for controlling the number of distinct subtrees, the depth, and the
size of a witness to finite satisfiability for FO² sentences over trees.

1 Introduction

The complexity of verifying properties over a class of structures depends on
both the specification language for properties and the class of structures. Full
first-order logic (FO) has non-elementary complexity even when applied to very
restricted structures, e.g. words. The two-variable fragment of FO, FO², is known
to have better properties. Satisfiability over arbitrary relational vocabularies is
decidable, and satisfiable sentences have exponential-sized models [GKV97]. Over
words witness models can also be taken to be exponential, and the satisfiability
problem is known to be NEXPTIME-complete, as it is over general structures
[EVW02]. The satisfiability results over words extend to give bounds on many
related verification problems [BLW12].

The NEXPTIME-completeness of FO² over both general structures and word
structures raises the question of the impact of structural restrictions on analysis
problems for FO². Surprisingly the complexity of satisfiability for FO² on a class
of structures satisfying a very simple graph-theoretic restriction, namely finite
trees, has not been investigated in detail. FO² over trees is known to correspond
precisely to the navigational core of the XML query language XPath [MdR04], and
the satisfiability problem for XPath is known to be complete for EXPTIME; given
that the translation from FO² to XPath is known to be exponential [MdR04],
this gives a 2EXPTIME bound on satisfiability for FO² over trees.

In this work we will consider the satisfiability problem for FO² over finite
trees, and the corresponding question of the size and depth needed for witness
models. In particular, we will consider:

— satisfiability in the presence of all navigational predicates - predicates for 
the parent/child relation, its transitive closure the descendant relation, the 
left- and right- sibling relations and their transitive closures 

— the impact on the complexity of limiting sentences to make use of predicates 
in a particular subset. 

— satisfiability over general unranked trees, and satisfiability in the presence of 
a schema 

— satisfiability over trees where node labels are denoted with explicit unary
labels versus the case where node labels are boolean combinations over
a propositional alphabet

We will show that each of these variations impacts the complexity of the
problem. In the process, we will show that the tree case differs in a number of
important ways from that of words. First, the complexity of satisfiability no longer
matches that of FO² on general structures: it is EXPSPACE-complete. Secondly,
the basic technique for analyzing FO² on words [EVW02], namely bounds on the
number of quantifier-rank types that occur in a structure, is not useful for getting
tight bounds on FO² over trees. Instead we will use a combination of methods,
including reductions to XPath, bounds on the number of subformula-based types,
and a quotient construction that is based not only on types, but on a set of
distinguished witness nodes. These techniques allow us to distinguish situations
where satisfiable FO²-formulas have models of (reasonably) small depth, and
situations where they have models of small size. This allows us to get a full
picture of the complexity of FO² satisfiability problems on trees.

Related work. Two-variable logic on data trees (trees where nodes are
associated with values in an infinite set) has been studied by Bojanczyk et al.
[BMSS09]: there the main result is decidability over the signature with data
equality and the child relation. Figueira's manuscript [Fig12] considers
two-variable logic with the successor relations corresponding to two linear orders,
which is quite different from considering the two successor relations derived from
a tree order. Kieronski et al. show that two-variable logic over two transitive
relations is undecidable. The complexity of two-variable logic over ordinary trees
is explicitly studied only in [BK09], where it is (incorrectly, as we show) stated
that the complexity of satisfiability remains in NEXPTIME for full two-variable
logic.

Organization: Section 2 gives preliminaries. Section 3 gives precise bounds
for the satisfiability of full FO² on trees. Section 4 considers the case where the
child predicate is absent, while Section 5 considers the case where the descendant
predicate is absent. Section 6 gives conclusions.



2 Logics and Models 

We will always use the term "tree" to denote a finite ordered labelled tree, where
the labels are sets of unary predicates P_1 ... P_n. An ordered tree will consist of
a finite set of nodes, a directed edge relation ParentOf between nodes such that
the underlying graph forms a tree in the usual sense, a mapping of each P_i to
a subset of the nodes, and a sibling relation NextSib between nodes that forms the
successor relation of a linear order when restricted to the set of children of a given
node. We sometimes write m DescOf n to denote that node m is a descendant of
node n in a tree, and similarly write m ChildOf n to denote that m is a child of
n. A tree satisfies the unary alphabet restriction (UAR) if exactly one P_i holds of
each node; in such a tree the labels are just predicates. Given a tree t and node
n, SubTree(t, n) denotes the subtree of t rooted at n.

We consider first-order logic sentences in which every subformula has at 
most two variables, allowing the equality predicate as well as relations from the 
following signatures for trees: 

— for general ordered trees, we consider by default a signature V_full containing
the node predicates P_i, as well as predicates for the ParentOf relation, its
transitive closure AncOf, the LeftSibOf relation that holds of c and d if c is
the immediate left sibling of d, and its transitive closure LeftOf.

— we let V_noAncOf be the vocabulary obtained by removing the descendant
relation, V_parOf be the vocabulary obtained by removing all binary relations
other than ParentOf, V_noParOf be the vocabulary obtained by removing the
ParentOf relation, and V_ancOf be the vocabulary obtained by removing all
binary relations other than AncOf.

We consider k-ranked trees as a particular class of unranked trees, and thus can
ask whether an FO² sentence in any of the signatures above is true on a ranked
tree. Note that for k-ranked trees it is natural to consider signatures that include
the relation ParentOf_i, connecting a node to its i-th child for each i < k, either in
place of or in addition to the predicates above. We will not consider a separate
signature for ranked trees, since it is easy to derive tight bounds for ranked trees
for such signatures based on the techniques introduced here. Although we allow
equality in our upper bounds, it will not play any role in the lower bounds.

The signatures above used predicates for which the first argument is either 
higher up in the tree than the second argument (ParentOf (c,d) means that c is 
the parent of d) or to the left of the second argument. However, in first-order 
logic, as well as in two-variable first-order logic, we can express the inverse of any 
atomic relation as a formula. Thus we can use formulas x DescOf y, x ChildOf y, 
etc. with the obvious meaning (e.g. x DescOf y meaning AncOf (y,x)). 

For any vocabulary V above, we let FO²[V] denote the fragment of first-order
logic consisting of formulas such that every subformula uses at most two variables.
When V is omitted it is assumed to be V_full.

A ranked tree schema consists of a bottom-up tree automaton on trees of
some rank k [Tho97]. A tree automaton takes trees labeled from a finite set Σ.
We will thus identify the symbols in Σ with predicates P_i, and thus all trees
satisfying the schema will satisfy the UAR.



We consider the following problems: 

— Given an FO² sentence φ and a schema S, determine whether φ is satisfied by
some tree satisfying S. We consider the combined complexity in the formula
and schema.

— Given an FO² sentence φ, determine if there is some tree (resp. k-ranked,
unary alphabet tree) that satisfies it.

Some of our results will go through XPath, a common language used for
querying XML documents viewed as trees. The navigational core of XPath is
a modal language, analogous to unary temporal logic on trees, denoted NavXP.
NavXP is built on binary modalities, referred to as axis relations. We will focus
on the following axes: self, child, descendant, descendant-or-self, ancestor-or-self,
next-sibling, following-sibling, preceding-sibling, previous-sibling. In a tree t, we
associate each axis a with a set R^t_a of pairs of nodes. R^t_child denotes the set of
pairs of nodes (x, y) in t where y is a child of x, and similarly for the other axes
(see [Mar04]).

NavXP consists of path expressions, which denote binary relations between
nodes in a tree, and filters, denoting unary relations. Below we give the syntax
(from [BK09]), using p to range over path expressions and q over filters.
L ranges over symbols for each labelling of a node (i.e. for general trees, boolean
combinations of predicates P_1 ... P_n, for UAR trees a single predicate).

p ::= step | p/p | p ∪ p        step ::= axis | step[q]

q ::= p | lab() = L | q ∧ q | q ∨ q | ¬q

where axis relations are given above.

The semantics of NavXP path expressions relative to a tree t is given by:
1. [[axis]] = R^t_axis
2. [[step[q]]] = {(n, n') ∈ [[step]] : n' ∈ [[q]]}
3. [[p_1/p_2]] = {(n, v) : ∃w (n, w) ∈ [[p_1]] ∧ (w, v) ∈ [[p_2]]}
4. [[p_1 ∪ p_2]] = [[p_1]] ∪ [[p_2]]

For filters we have:
1. [[lab() = L]] = {n : n has label L}
2. [[p]] = {n : ∃n' (n, n') ∈ [[p]]}
3. [[q_1 ∧ q_2]] = [[q_1]] ∩ [[q_2]]
4. [[¬q]] = {n : n ∉ [[q]]}

A NavXP filter is said to hold of a tree t if it holds of the root under the above
semantics.
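
To make the semantics concrete, here is a minimal Python sketch of an evaluator for path expressions and filters. The tuple encoding of expressions, the dictionary encoding of trees, and all function names are assumptions made for illustration; only four axes are covered, and disjunction of filters is read as set union.

```python
def axis_relation(children, axis):
    """Pairs (x, y) related by an axis; only four axes are covered in this sketch."""
    nodes = list(children)
    child = {(p, c) for p, cs in children.items() for c in cs}
    if axis == "self":
        return {(n, n) for n in nodes}
    if axis == "child":
        return child
    if axis == "descendant":
        # transitive closure of the child relation
        rel = set(child)
        while True:
            step = {(a, d) for (a, b) in rel for (c, d) in child if b == c}
            if step <= rel:
                return rel
            rel |= step
    if axis == "next-sibling":
        return {(cs[i], cs[i + 1])
                for cs in children.values() for i in range(len(cs) - 1)}
    raise ValueError(f"axis not covered by this sketch: {axis}")


def eval_path(children, labels, p):
    """Binary relation denoted by a path expression (hypothetical tuple encoding)."""
    kind = p[0]
    if kind == "axis":                       # axis
        return axis_relation(children, p[1])
    if kind == "step":                       # step[q]
        ok = eval_filter(children, labels, p[2])
        return {(n, m) for (n, m) in eval_path(children, labels, p[1]) if m in ok}
    if kind == "seq":                        # p1/p2
        left = eval_path(children, labels, p[1])
        right = eval_path(children, labels, p[2])
        return {(n, v) for (n, w) in left for (w2, v) in right if w == w2}
    if kind == "union":                      # p1 ∪ p2
        return eval_path(children, labels, p[1]) | eval_path(children, labels, p[2])
    raise ValueError(kind)


def eval_filter(children, labels, q):
    """Set of nodes denoted by a filter."""
    kind = q[0]
    if kind == "lab":                        # lab() = L
        return {n for n in children if labels[n] == q[1]}
    if kind == "path":                       # p used as a filter
        return {n for (n, _) in eval_path(children, labels, q[1])}
    if kind == "and":
        return eval_filter(children, labels, q[1]) & eval_filter(children, labels, q[2])
    if kind == "or":
        return eval_filter(children, labels, q[1]) | eval_filter(children, labels, q[2])
    if kind == "not":
        return set(children) - eval_filter(children, labels, q[1])
    raise ValueError(kind)


def holds(children, labels, root, q):
    """A filter holds of a tree if it holds of the root."""
    return root in eval_filter(children, labels, q)


# A small UAR tree: node -> ordered list of children, node -> single label.
CHILDREN = {0: [1, 2], 1: [3], 2: [], 3: []}
LABELS = {0: "a", 1: "b", 2: "a", 3: "a"}
```

For instance, the filter `("path", ("step", ("axis", "descendant"), ("lab", "b")))` asks whether the root has a descendant labelled b, which is the case in the example tree.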

Marx and De Rijke showed an expressive equivalence of NavXP and FO²,
extending the translation to Unary Temporal Logic in the word case:

Proposition 1. [MdR04] There is an exponential translation from FO² to NavXP
with all axes and from FO²[V_ancOf] to NavXP with only the descendant and
ancestor axes.

Marx has shown that NavXP has an exponential time satisfiability problem
[Mar04]. From this and the above proposition, we get the following (implicit in
[MdR04]):

Corollary 1. The satisfiability problem for FO² is in 2EXPTIME.

3 Satisfiability for full FO²

Subformula types and exponential depth bounds. In the analysis of
satisfiability of FO² for words of Etessami, Vardi, and Wilke [EVW02], a NEXPTIME
bound is achieved by showing that any sentence with a finite model has a model
of at most exponential size. The small model property follows, roughly speaking,
from the fact that any model realizes only exponentially many "quantifier-rank
types" (maximal consistent sets of formulas of a given quantifier rank) and the
fact that two nodes with the same quantifier-rank type can be identified.

In the case of trees, this approach breaks down in several places. It is easy to
see that one cannot always obtain an exponential-sized model, since a sentence
can enforce binary branching and exponential depth. Because there are
doubly-exponentially many non-isomorphic small-depth subtrees, there can be
doubly-exponentially many quantifier-rank types realized even along a single path
in a tree: so quantifier-rank types cannot be used even to show an exponential
depth bound. We thus use subformula types of a given FO²-formula φ (for short,
φ-types): these are maximal consistent collections of one-variable subformulas of
φ. The φ-type of a node n in a tree, Tp_φ(n), is defined as the set of subformulas
of φ it satisfies. The number of φ-types is only exponential in |φ|, but subformula
types are more delicate than quantifier-rank types. E.g. nodes with the same
φ-type cannot always be identified without changing the truth of φ. Most of the
upper bounds will be concerned with handling this issue, by adding additional
conditions on nodes to be identified, and/or preserving additional parts of the
tree.

Upper bounds for FO². We exhibit the issues arising and techniques used
to solve them by giving an upper bound for the full logic, FO², which improves
on the 2EXPTIME bound one obtains via translation to modal logic.

Theorem 1. The satisfiability problem for FO² is in EXPSPACE.

The key to the proof is to show the "exponential depth property":

Lemma 1. Every satisfiable FO² sentence φ has a model where the depth is
bounded by 2^poly(|φ|), and similarly for satisfiability w.r.t. UAR trees or ranked
schemas. The outdegree of nodes can also be bounded by 2^poly(|φ|).

We give the argument for the depth bound, leaving the similar proof for the
branching bound to the appendix. Given a tree t and nodes n_0 and n_1 in t with
n_1 not an ancestor of n_0, the overwrite of n_0 by n_1 in t is the tree t(n_1 → n_0)
formed by replacing the subtree of n_0 with the subtree of n_1 in t. Let F be
the binary relation relating a node m in t to its copies in t(n_1 → n_0): n_1 and
its descendants have a single copy if n_1 is a descendant of n_0, and two copies
otherwise; nodes in SubTree(t, n_0) that are not in SubTree(t, n_1) have no copies,
and other nodes have a single copy. In the case that n_1 is a descendant of n_0, F is
a partial function. We say an equivalence relation ≡ on nodes of a tree t is globally
φ-preserving if for any equivalent nodes n_0, n_1 in t with n_0 ∉ SubTree(t, n_1),
the φ-type of a node n in t is the same as the φ-type of nodes in F(n) within
t(n_1 → n_0). We say it is pathwise φ-preserving if this holds for any nodes n_0, n_1
in t with n_1 a descendant of n_0. The path-index of an equivalence relation on t
is the maximum of the number of equivalence classes represented on any path,
while the index is the total number of classes.
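
The overwrite operation can be illustrated with a short Python sketch. The encoding is an assumption made here: a tree is a nested tuple `(label, children)`, and a node is addressed by the sequence of child indices leading to it from the root. The function names are hypothetical.

```python
def subtree(t, addr):
    """Subtree of t rooted at the node with the given address."""
    for i in addr:
        t = t[1][i]
    return t


def replace_at(t, addr, new):
    """Copy of t in which the subtree at addr is replaced by new."""
    if not addr:
        return new
    label, kids = t
    i = addr[0]
    return (label, kids[:i] + (replace_at(kids[i], addr[1:], new),) + kids[i + 1:])


def is_proper_ancestor(a, b):
    """True if the node at address a is a proper ancestor of the node at address b."""
    return len(a) < len(b) and b[:len(a)] == a


def overwrite(t, n0, n1):
    """Overwrite of n0 by n1: replace the subtree of n0 with the subtree of n1."""
    if is_proper_ancestor(n1, n0):
        raise ValueError("n1 must not be an ancestor of n0")
    return replace_at(t, n0, subtree(t, n1))


# r has children a and c; a has one child b.
T = ("r", (("a", (("b", ()),)), ("c", ())))
```

When `n1` is a descendant of `n0` the overwrite contracts a path; when the two nodes are incomparable, the subtree of `n1` ends up with two copies.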

We cannot always overwrite a node with another having the same φ-type, but
by adding additional information, we can get a pathwise φ-preserving relation
with small path-index. For a node n, let DescTypes(n) be the set of φ-types
of descendants of n, and AncTypes(n) the set of φ-types of ancestors of n. Let
IncompTypes(n) be the φ-types of nodes n' that are neither descendants nor
ancestors of n. Say n_0 ≡_Full n_1 if they agree on their φ-type, the set DescTypes,
the set AncTypes, and the set IncompTypes.
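
As an illustration, the following Python sketch computes the four components on which two nodes must agree. It assumes the type of every node is already given as an opaque value in a dictionary `tp` (computing actual subformula types is outside the sketch), and it represents the tree by a hypothetical `parent` map with `None` at the root.

```python
def ancestors(parent, n):
    """Proper ancestors of n, from its parent up to the root."""
    out = []
    while parent[n] is not None:
        n = parent[n]
        out.append(n)
    return out


def full_signature(parent, tp, n):
    """(type, DescTypes, AncTypes, IncompTypes) of node n."""
    anc = set(ancestors(parent, n))
    desc = {m for m in parent if n in ancestors(parent, m)}
    incomp = set(parent) - anc - desc - {n}
    return (
        tp[n],
        frozenset(tp[m] for m in desc),
        frozenset(tp[m] for m in anc),
        frozenset(tp[m] for m in incomp),
    )


def equivalent(parent, tp, n0, n1):
    """Two nodes are related when all four components coincide."""
    return full_signature(parent, tp, n0) == full_signature(parent, tp, n1)


# A single path 0 -> 1 -> 2 -> 3 -> 4 -> 5 with placeholder types.
PARENT = {0: None, 1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
TP = {0: "A", 1: "B", 2: "B", 3: "B", 4: "B", 5: "C"}
```

On this path, nodes 2 and 3 are related, while nodes 1 and 2 are not, because the type B appears among the ancestors of node 2 but not among those of node 1.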

Lemma 2. The relation ≡_Full is pathwise φ-preserving, and its path index is
bounded by 2^poly(|φ|). Thus, there is a polynomial P such that for any tree t
satisfying φ and root-to-leaf path p of length at least 2^P(|φ|), there are two nodes
n_0, n_1 on p such that t(n_1 → n_0) still satisfies φ. Given a tree automaton A, it
can be arranged that A reaches the same state on n_0 as on n_1.

Given Lemma 2, Lemma 1 follows by contracting all paths exceeding a given
length until the depth of the tree is exponential in |φ|. In fact (e.g., for ranked
trees) ≡_Full can be used as the state set of a tree automaton. The path index
property implies that the automaton goes through only exponentially many states
on any path of a tree. By taking the product of this automaton with a ranked
schema, the corresponding depth bound relative to a schema follows.

We give the simple argument for the path index bound in Lemma 2, leaving
the proof that ≡_Full is pathwise φ-preserving to the appendix. First, note that the
total number of φ-types is exponential in |φ|. Now the sets DescTypes(n) either
become smaller or stay the same as n varies down a path, and hence can only
change exponentially often. Similarly the sets IncompTypes(n) and AncTypes(n)
grow bigger or stay the same, and thus can change only exponentially often.
In intervals along a path where all three of these sets are stable, the number of
possibilities for the φ-type of a node is exponential. This gives the path index
bound.
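
The counting can be spelled out as follows. This is a sketch under the assumption that a type is a set of subformulas of the sentence, so that the number $N$ of types is at most $2^{|\varphi|}$; the constants are only indicative.

```latex
% Each of the three sets is a subset of an N-element set and changes
% monotonically along a path, so it changes at most N times. Hence a path
% splits into at most 3N + 1 intervals on which all three sets are constant,
% and within each interval a class is determined by one of at most N types:
\[
  \text{path-index} \;\le\; (3N + 1)\cdot N \;\le\; 4N^{2}
  \;\le\; 4\cdot 2^{2|\varphi|} \;=\; 2^{\,2|\varphi| + 2}.
\]
```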

Theorem 1 follows from combining Lemma 1 with the following result on
satisfiability of NavXP:

Theorem 2. The satisfiability of a NavXP filter φ over trees of bounded depth b
is in PSPACE (in b and |φ|).

The result is proved in the appendix, but it is a variant of a result from [BFG08]
that finite satisfiability for the fragment of NavXP which contains only the axis
relations child, parent, next-sibling, preceding-sibling, previous-sibling and
following-sibling is in PSPACE. Given Theorem 2 we complete the proof of Theorem 1
by translating an FO² sentence φ into a NavXP filter φ' with an exponential
blow-up, using Proposition 1. By Lemma 1 the depth of a witness structure is
bounded by an exponential in |φ|, and the EXPSPACE result follows.

Lower bound. We now show a matching lower bound for the satisfiability
problem.

Theorem 3. The satisfiability problem for FO² is EXPSPACE-hard, with
hardness holding even when formulas are restricted to be in FO²[V_ancOf].

This is proved by coding the acceptance problem for an alternating exponential
time machine. A tree node can be associated with an n-bit address, either by
using multiple predicates (for FO²[V_ancOf]) or via children. The equality and
successor relations between the addresses associated to nodes x and y can be
coded in FO² using the standard argument (see the NEXPTIME-hardness proof
of [EVW02]). A path corresponds to one thread of the alternating computation,
and the tree structure is used to code alternation.

4 Satisfiability without child 

The exponential depth bound revisited. As noted in the previous section,
the satisfiability problem is still EXPSPACE-complete even when the ChildOf
relation is removed. However, we take a closer look at this case, noting some
connections with other logics and some further restrictions that lower the
complexity.

We first consider the relationship of FO² without child to modal tree languages.

Let downward stutter-free NavXP, denoted DownSF-NavXP, be the fragment
of NavXP obtained by restricting to the descendant, ancestor, and all sibling
axes. The complexity of satisfiability for DownSF-NavXP has not been studied in
prior work, including [BFG08], but we can show the following depth bound for
DownSF-NavXP:

Theorem 4. Every satisfiable DownSF-NavXP sentence has a model of
polynomial depth. The satisfiability problem for DownSF-NavXP is PSPACE-complete.

The proof resembles the result that a satisfiable stutter-free temporal logic 
formula has a model of polynomial size. Some care needs to be taken to deal with 
the sibling axes, which allow a DownSF-NavXP formula to look off of a given 
path. 

This result shows that tight bounds for two-variable logic without child can
actually be obtained via translation to modal languages: combining the first
part of Theorem 4 and the translation to NavXP from Proposition 1, we get
an alternative proof of the exponential depth bound in Lemma 1, as well as the
EXPSPACE upper bound for satisfiability, in the special case of FO²[V_noParOf].

Unary Alphabet Restriction, polynomial alternation bounds, and
polynomial depth bounds. The previous section showed EXPSPACE-completeness
for satisfiability of FO²[V_ancOf]. However the EXPSPACE-hardness argument
for V_ancOf makes use of multiple predicates holding at a given node, to code the
address of a tape cell of an alternating EXPTIME Turing Machine. It thus does
not apply to satisfiability over Unary Alphabet Restriction trees (as defined in
Section 2) or to satisfiability with respect to a schema, since schemas restrict to
a single alphabet symbol per node. We show that the complexity of satisfiability
is actually "lower" (that is, modulo the assumption NEXPTIME ≠ EXPSPACE)
when the UAR is imposed, using distinct techniques for the case of ranked and
unranked trees.

We start by noting that one always has at least NEXPTIME-hardness, even
with UAR.

Theorem 5. The satisfiability of FO²[V_ancOf] with the unary alphabet restriction
is NEXPTIME-hard, and similarly with respect to a ranked schema.

The proof is a variation of the argument for NEXPTIME-hardness for words
[EVW02], but this time using the frontier of a shallow but wide tree to code the
tiling of an exponential grid.



We will prove a matching NEXPTIME upper bound for UAR trees and for
satisfiability with respect to a ranked schema. To do this, we extend an idea
introduced in the thesis of Philipp Weis [Wei11], working in the context of FO²[<]
on UAR words: polynomial bounds on the number of times a formula changes its
truth value while keeping the same symbol along a given path.

The following is a generalization of Lemma 2.1.10 of Weis [Wei11].

Consider an FO²[V_ancOf] formula ψ(x), a tree t satisfying the UAR, and fix
a root-to-leaf path p = p_1 ... p_max(p) in t. Given a label a, define an a-interval in
p to be a set of the form {i : m_1 < i < m_2, t, p_i ⊨ a(x)}.

Lemma 3. For every FO²[V_ancOf] formula ψ(x), UAR tree t, and root-to-leaf
path p = p_1 ... p_max(p) in t, the set {i | t, p_i ⊨ ψ ∧ a(x)} is made up of at most
|ψ| a-intervals.
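
The quantity bounded by the lemma can be computed directly once the truth value of the formula at each position of the path is known. The Python sketch below assumes those truth values are supplied as a list (it does not evaluate formulas), and the function name is hypothetical.

```python
def count_a_intervals(labels, holds, a):
    """Least number of a-intervals whose union is the set of positions that
    carry label a and satisfy the formula.

    labels[i] is the label of the i-th node on the path and holds[i] the
    truth value of the formula there.
    """
    count = 0
    in_run = False
    for lab, h in zip(labels, holds):
        if lab != a:
            continue  # positions with another label never split an a-interval
        if h and not in_run:
            count += 1  # a new maximal run of a-positions satisfying the formula
        in_run = h
    return count


PATH_LABELS = list("abaabab")
PATH_HOLDS = [True, False, True, False, True, True, True]
```

In the example, the a-labelled positions are 0, 2, 3 and 5, with truth values true, true, false, true, so two a-intervals are needed.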

From Lemma 3 we will show that FO²[V_ancOf] sentences that are satisfiable
over UAR trees always have polynomial-depth witnesses:

Lemma 4. If an FO²[V_ancOf] formula φ is satisfied over a UAR tree, then it is
satisfied by a model of depth bounded by a polynomial in |φ|.

Proof. Suppose that φ is satisfied over a UAR tree t. On each path p, for each
letter b, let a b,φ-interval be a maximal b-interval on which every one-variable
subformula of φ has constant truth value. By the lemma above, the total number
of such intervals is polynomially bounded. We let W contain the endpoints of
each b,φ-interval for all symbols b. We note the following crucial property of W:
for every node m in p which is not in W, there is a node in W with the same
φ-type as m that is strictly above m, and also one strictly below m.



[Fig. 1. Tree Promotion; labelled element: path p]



The idea is now to remove all those points on path p that are not in W. This
must be done in a slightly unusual way, by "promoting" subtrees that are off
the path. For every removed node r, for every child c of r not on p, we attach
the subtree rooted at c to the closest node of W above r (see Figure 1). Let t'
denote the tree obtained as a result of this surgery. Formally, the nodes of t' are
all nodes of t that are not in p or are in W. Each such node has the same label
that it had in t. For any node m in t with parent n, if both m and n are in t'
then n is again the parent of m in t'. On the other hand, if only m is in t' then
its parent in t' is its lowest ancestor in W.
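
The surgery can be sketched in a few lines of Python. The representation is an assumption: the tree is a `parent` map with `None` at the root, and sibling order is ignored, which is harmless here only because the vocabulary under discussion has no sibling relations. The sketch also assumes that the first node of the path belongs to W.

```python
def promote(parent, path, W):
    """Remove the nodes of `path` that are not in W and re-attach every
    surviving node to its lowest surviving ancestor."""
    removed = set(path) - set(W)
    if path[0] in removed:
        raise ValueError("this sketch assumes the first node of the path is in W")
    new_parent = {}
    for m, n in parent.items():
        if m in removed:
            continue
        # Removed nodes all lie on the path, so the first surviving node
        # met while climbing is the lowest ancestor in W.
        while n in removed:
            n = parent[n]
        new_parent[m] = n
    return new_parent


def depth(parent):
    """Maximum number of edges from a node up to the root."""
    def up(n):
        k = 0
        while parent[n] is not None:
            n = parent[n]
            k += 1
        return k
    return max(up(n) for n in parent)


# Path 0 -> 1 -> 2 -> 3, with off-path children 4 (of 1) and 5 (of 2).
PARENT = {0: None, 1: 0, 2: 1, 3: 2, 4: 1, 5: 2}
PROMOTED = promote(PARENT, path=[0, 1, 2, 3], W={0, 3})
```

In the example the off-path subtrees rooted at 4 and 5 are promoted to node 0, and the depth drops from 3 to 1.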

Let f be the partial function taking a node in t that is not removed to its
image in t'. We claim that t' still satisfies φ, and more generally that for any
subformula ρ(x) of φ and node m of t, we have t, m ⊨ ρ iff t', f(m) ⊨ ρ. This is
proved by induction on ρ, with the base cases and the cases for boolean operators
being straightforward. For an existential formula ∃y β(x, y), we give just the "only
if" direction, which is via case analysis on the position of a witness node w such
that t, m, w ⊨ β.

If w is in t' then t', m, w ⊨ β by the induction hypothesis and the fact that
w is an ancestor (or descendant) of m in t' if and only if it is an ancestor (or
descendant) of m in t.

If w is not in t', then it must be that w lies on the path p and is not one of the
protected witnesses in W. But then w has both an ancestor w' and descendant
w'' in W that satisfy all the same one-variable subformulas as w does in t,
with both w' and w'' preserved in the tree t'. If m and w'' are distinct then
t', m, w'' ⊨ β by the induction hypothesis and the fact that m and w'' have
the same ancestor/descendant relationship in t' as do m and w in t. If m is
identical to w'' then t', m, w' ⊨ β by similar reasoning. In any case we deduce
that t', m ⊨ ∃y β.

Since this process reduces the length of the chosen path p and does not
increase the length of any other path, it is clear that iterating it yields a tree of
polynomial depth.

Note that we can guess a tree as above in NEXPTIME, and hence we have
the following bound:

Theorem 6. Satisfiability for FO²[V_ancOf] formulas over UAR unranked trees
is in NEXPTIME, and hence is NEXPTIME-complete.

Bounds on subtrees and satisfiability of FO²[V_ancOf] with respect
to a ranked schema. The collapse argument above relied heavily on the fact
that trees were unranked, since over a fixed rank we could not apply "pathwise
collapse". Indeed, we can show that over ranked trees, an FO²[V_ancOf] formula
satisfiable over UAR trees need not have a witness of polynomial depth:

Theorem 7. There are FO²[V_ancOf] formulas φ_n of size O(n) that are satisfiable
over UAR binary trees, where the minimum depth of satisfying UAR binary trees
grows as 2^n.

Nevertheless, we can still obtain an NEXPTIME bound for UAR trees of
a given rank, and even for satisfiability with respect to a ranked schema.

Theorem 8. The satisfiability problem for FO²[V_ancOf] over ranked schemas is
in NEXPTIME, and is thus NEXPTIME-complete.

We give the argument only for satisfiability with respect to rank-k UAR
trees, leaving the extension to schemas for the appendix. This will also serve
as an alternative proof of Theorem 6. The idea will be to create a model with
only an exponential number of distinct subtrees, which can be represented by
an exponential-sized DAG. We do this by creating an equivalence relation that is
globally φ-preserving (not just pathwise) and which has exponential index (not
just path index). We will then collapse equivalent nodes, as in Lemma 2. There
are several distinctions from that lemma: to identify nodes that are not necessarily
comparable we cannot afford to abstract a node by the set of all the types realized
below it, since within the tree as a whole there can be doubly-exponentially many
such sets. Instead we will make use of some "global information" about the tree,
in the form of a set of "protected witnesses", which we denote W.

By Lemma 1 we know that a satisfiable FO²[V_ancOf] formula φ has a model
t of depth at most exponential in |φ|. Fix such a t. For each φ-type τ, let w_τ
be a node of t with maximal depth satisfying τ. We include all w_τ and all of
their ancestors in a set W, and call these basic global witnesses. For any m
that is an ancestor of or equal to a basic global witness w_τ, and any subformula
ρ(x) = ∃y β(x, y) of φ, if there is w' incomparable (by the descendant relation) to
m such that t, m, w' ⊨ β we add one such w' to W, along with all its ancestors;
these are the incomparable global witnesses.

We need one more definition. Given a node m in a tree, for every φ-type
τ realized by some ancestor m' of m, for every subformula ∃y β(x, y) of τ, if
there is a descendant w of m such that t, m', w ⊨ β(x, y), choose one such
witness w and let SelectedDescTypes(m) include the φ-type of that witness. Note
that the same witness will suffice for every ancestor m' realizing τ, and since
there are only polynomially many φ-types realized on the path, the collection
SelectedDescTypes(m) will be of polynomial size.

Now we transform t to t' such that t' ⊨ φ and t' has only exponentially
many different subtrees. We make use of a well-founded linear order ≺ on trees
with a given rank and label alphabet, such that: 1. SubTree(t, n') ≺ SubTree(t, n)
implies n' is not an ancestor of n; 2. for every tree C with a distinguished leaf,
for trees t_1, t_2 with t_1 ≺ t_2, we have C[t_1] ≺ C[t_2], where C[t_i] is the tree obtained
by replacing the distinguished leaf of C with t_i. There are many such orderings,
e.g. using standard string encodings of a tree.
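
One candidate ordering, given here only as a sketch, compares trees first by number of nodes and then by a parenthesised string encoding. The encoding, the `HOLE` marker for the distinguished leaf, and the function names are assumptions for illustration; labels are assumed not to contain parentheses. The two properties are checked on examples, not proved.

```python
HOLE = "*"  # hypothetical label marking the distinguished leaf of a context


def size(t):
    """Number of nodes of a tree given as (label, children)."""
    return 1 + sum(size(c) for c in t[1])


def encode(t):
    """Parenthesised string encoding; injective if labels contain no parentheses."""
    label, kids = t
    return label + "(" + "".join(encode(c) for c in kids) + ")"


def precedes(t1, t2):
    """Strict order: smaller trees first, ties broken by the encoding."""
    return (size(t1), encode(t1)) < (size(t2), encode(t2))


def plug(context, t):
    """Replace the distinguished leaf of the context by t."""
    label, kids = context
    if label == HOLE and not kids:
        return t
    return (label, tuple(plug(c, t) for c in kids))


LEAF_A = ("a", ())
LEAF_B = ("b", ())
CONTEXT = ("r", (("a", ()), (HOLE, ())))
WHOLE = ("r", (LEAF_A,))
```

Because a proper subtree has strictly fewer nodes than the tree containing it, the subtree at an ancestor can never precede the subtree at one of its descendants, which matches the first requirement; plugging two trees into the same context shifts both sizes by the same amount and keeps a common prefix and suffix around their encodings, which is the intuition behind the second.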

For any model t, if there are two nodes n, n' in t such that 1. n, n' ∉ W,
2. Tp_φ(n) = Tp_φ(n'), 3. AncTypes(n) = AncTypes(n'), 4. SelectedDescTypes(n)
= SelectedDescTypes(n'), 5. SubTree(t, n') ≺ SubTree(t, n) (which implies that
n' cannot be an ancestor of n), then let t' = Update(t) be obtained by choosing
such n and n' and replacing the subtree rooted at n by the subtree rooted at n'.

Let T_1 be the nodes in t that were not in SubTree(t, n), and for any node
m ∈ T_1 let f(m) denote the same node considered within t'. Let T_2 denote the
nodes in t' that are images of a node in SubTree(t, n'). For each m ∈ T_2, let
f^{-1}(m) denote the node in SubTree(t, n') from which it derives.

We claim the following: 

Lemma 5. For all m ∈ T_1 the φ-type of m in t is the same as the φ-type of
f(m) in t'. Moreover, for every node m' in T_2, the φ-type of m' in t' is the same
as that of f^{-1}(m') in t.

Applying the lemma above to the root of t, which is necessarily in T_1, it
follows that the truth of the sentence φ is preserved by this operation.



We now iterate the procedure t_{i+1} := Update(t_i), until no more updates are
possible. This procedure terminates, because the tree decreases in the order ≺
at every step. We can thus represent the tree as an exponential-sized DAG, with
one node for each subtree.
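
The DAG representation amounts to sharing identical subtrees. The Python sketch below shows this with a standard hash-consing table; it is an illustration of the representation only, not of the update procedure, and the names are hypothetical.

```python
def to_dag(t, table=None):
    """Return (identifier of t, table). The table maps (label, child ids) to an
    identifier, so every distinct subtree gets exactly one DAG node."""
    if table is None:
        table = {}
    label, kids = t
    key = (label, tuple(to_dag(c, table)[0] for c in kids))
    if key not in table:
        table[key] = len(table)
    return table[key], table


def count_nodes(t):
    """Number of nodes of the unfolded tree."""
    return 1 + sum(count_nodes(c) for c in t[1])


def complete_binary(depth):
    """Complete binary tree of the given depth with a single label."""
    t = ("a", ())
    for _ in range(depth):
        t = ("a", (t, t))
    return t


TREE = complete_binary(10)
ROOT_ID, TABLE = to_dag(TREE)
```

For the complete binary tree of depth 10 with a single label, the tree has 2047 nodes while the DAG has only 11, one per distinct subtree.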

Thus we have shown that any satisfiable formula has an exponential-size
DAG that unfolds into a model of the formula. Given such a DAG, we can check
whether an FO² formula holds in polynomial time in the size of the DAG. This
gives a NEXPTIME algorithm for checking satisfiability.

5 Satisfiability without descendant 

Recall that even on words with only the successor relation, the satisfiability
problem for two-variable logic is NEXPTIME-hard [EVW02]. From this it is easy
to see that the satisfiability for FO²[V_parOf] is NEXPTIME-hard, on ranked and
unranked trees.

Theorem 9. The satisfiability problem for FO²[V_parOf] is NEXPTIME-hard,
even with the unary alphabet restriction.

We now present a matching upper bound, which holds even in the presence of
sibling relations, i.e., for FO²[V_noAncOf]. The result is surprising, in that it is easy
to write satisfiable FO²[V_parOf] sentences φ_n of polynomial size whose smallest
tree model is of depth exponential in n, and whose size is doubly exponential.
Indeed, such formulas can be obtained as a variation of the proof of Theorem 9,
by coding a complete binary tree whose nodes are associated with n-bit numbers,
increasing the number by 1 as we move from parent to either child.
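
The growth can be checked with a few lines of arithmetic. The sketch assumes, for illustration only, that the root carries the number 0 and that the leaves are exactly the nodes carrying the largest n-bit number; the function name is hypothetical.

```python
def counter_tree_stats(n):
    """Depth (in edges) and number of nodes of the complete binary tree whose
    root carries 0, whose leaves carry 2**n - 1, and where each edge adds 1."""
    last = 2 ** n - 1          # largest n-bit number, reached at the leaves
    depth = last               # one edge per increment
    size = 2 ** (depth + 1) - 1  # nodes of a complete binary tree of that depth
    return depth, size
```

Already for n = 3 this gives depth 7 and 255 nodes; the depth grows as 2^n and the size as 2^(2^n).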

The result below relies on the fact that one can witness the satisfiability of 
a given formula by an exponential-sized DAG. 

Theorem 10. The satisfiability problem for FO$^2$[$V_{noAncOf}$], and the satisfiability
problem with respect to a rank schema, are in NEXPTIME, and hence are
NEXPTIME-complete.

We sketch the idea for satisfiability, which iteratively quotients the structure
by an equivalence relation, while preserving certain global witnesses, along the
lines of Theorem 8. By Lemma 1 we know that a satisfiable FO$^2$[$V_{noAncOf}$]
formula $\varphi$ has a model $t$ of depth at most exponential in the size of $\varphi$, where
the outdegree of nodes is bounded by an exponential.

For each $\varphi$-type that is satisfied in $t$, choose a witness and include it along
with all its ancestors in a set $W$; that is, we include the "basic witnesses" as in
Theorem 8. We also include all children of each basic witness; call these "child
witnesses".

Thus the size of the set of "protected witnesses" $W$ is again at most exponential.
Now we transform $t$ to $t'$ such that $t' \models \varphi$ and at the same time $t'$
has only exponentially many different subtrees. Our update procedure looks
for nodes $n, n'$ in $t$ such that 1. $n, n' \notin W$; 2. SubTree($t$, $n'$) $\prec$ SubTree($t$, $n$),
where $\prec$ is an appropriate ordering (as in Theorem 8); 3. $Tp_\varphi(n) = Tp_\varphi(n')$ and
$Tp_\varphi(parent(n)) = Tp_\varphi(parent(n'))$. We then obtain $t' = \mathrm{Update}(t)$ by choosing
such $n$ and $n'$ and replacing SubTree($t$, $n$) by SubTree($t$, $n'$).



The theorem is proved by showing that this update operation preserves $\varphi$.
Iterating it until no two such nodes can be found produces a tree that can be
represented as an exponential-size DAG.

6 Conclusions 

We have shown that the parallel between the complexity of FO 2 satisfiability 
on general structures and on restricted structures breaks down as we move 
from words to trees - trees allow one to encode alternating exponential time 
computation, leading to EXPSPACE-hardness. On the other hand, we show that 
analogs of the "model shrinking" methods for FO 2 on words exist for trees, 
albeit using a different shrinking technique. In future work, we are extending the 
analysis to infinite trees, where we believe it can be useful for analyzing branching 
time properties of both non-deterministic and probabilistic systems, as was done 
for linear time in [BLW12]. We are also considering the case of structures of fixed
tree-width.

Our main complexity results on satisfiability are summarized in Table 6,
where in each case the bound is tight.





                       FO$^2$     FO$^2$[$V_{ancOf}$]   FO$^2$[$V_{noParOf}$]   FO$^2$[$V_{parOf}$]
All Trees              EXPSPACE   EXPSPACE              EXPSPACE                NEXPTIME
w.r.t. Ranked Schema   EXPSPACE   NEXPTIME              EXPSPACE                NEXPTIME
More detail on the proof of Lemma 1 and Lemma 2

We first give a detailed proof of the following statement from Lemma 2:

The equivalence relation $\equiv_{Full}$ is pathwise $\varphi$-preserving.

Fix a tree $t$ and nodes $n_0 \equiv_{Full} n_1$ lying on the same path $p$ in $t$, with $n_1$ a descendant
of $n_0$. Let $t'$ be formed by overwriting $n_0$ with $n_1$, and let $f$ be the mapping taking
a node that lies in the subtree of $n_1$ or outside of the subtree of $n_0$ to its image
in $t'$. By the "collapsed part of $t$" we refer to the part of $t$ not in the domain of $f$.

We prove via structural induction that for every subformula $\rho$ of $\varphi$ and node
$m$ in the domain of $f$ we have $t, m \models \rho \Leftrightarrow t', f(m) \models \rho$. The atomic cases and the
boolean operators are clear, so existential quantification is the only non-trivial
case.

Consider first a node $m$ in the bottom half of the non-collapsed structure,
that is, in SubTree($t$, $n_1$), satisfying $\rho(x) = \exists y\, \beta(x,y)$. By induction we need
consider only the case where some node $w$ witnessing that $m$ satisfies $\rho$ in $t$ is
not in the domain of $f$. Fix such a witness node $w$. We show that we can find
a node that satisfies the same one-variable subformulas of $\rho$ that $w$ does, and
which satisfies the same axis relations with respect to $m$ that $w$ does.

When the witness to the existential quantifier in $\rho$ is a parent of $m$, then we
must have $m = n_1$. Now we can apply the hypothesis that the $\varphi$-type of $n_0$ is the
same as the $\varphi$-type of $n_1$, plus the induction hypothesis, to conclude that $f(m)$
must satisfy $\rho$. The case in which the witness $w$ is a descendant of $m$ or equal
to $m$ need not be considered, since such a witness must be in the domain of $f$,
which is ruled out by assumption. Now consider the case where some witness $w$ is
an ancestor of $m$, but not a parent. Such a $w$ must be on the path $p$. In this case,
we can use the fact that AncTypes($n_0$) = AncTypes($n_1$) to argue that a witness
can be found. Finally, suppose there is a node $w$ witnessing that $t, m \models \rho(x)$ such that
$w$ is not an ancestor or a descendant of $m$. Then we can apply the fact that
IncompTypes($n_0$) = IncompTypes($n_1$) to find a witness $w'$ that is incomparable to
$n_0$, but still in the domain of $f$. Such a $w'$ can be used (by induction) as a witness
that $t', f(m) \models \rho$.

We now move to the case where $m$ is in the top half of the non-collapsed
structure, satisfying $\rho(x) = \exists y\, \beta(x,y)$. We are interested in the case where all
witnesses $w$ to the existential quantifier in $\rho$ are in the collapsed part of the
structure, and hence are not ancestors of $m$.

Suppose we have a witness that is not a descendant or ancestor of $m$. The
witness must be a descendant of $n_0$, and $n_0$ must not be a descendant of $m$. We
can apply again the fact that DescTypes($n_0$) = DescTypes($n_1$) to find a witness
$w'$ below $n_1$, which will suffice by induction.

If the witness $w$ is in the collapsed part of $t$ and is a child of $m$, we must
have $m = n_0$, and hence we can use the fact that $Tp_\varphi(n_0) = Tp_\varphi(n_1)$ to get the
desired witness. Now suppose we have a witness $w$ in the collapsed part of the
structure, with $w$ a descendant of $m$ but not a child of $m$. Again, if $m = n_0$ we
are done, using the fact that $Tp_\varphi(n_0) = Tp_\varphi(n_1)$. If $m \neq n_0$, then $m$ must be
a strict ancestor of $n_0$. From DescTypes($n_0$) = DescTypes($n_1$) we know that
there is a descendant $w'$ of $m$ with the same $\varphi$-type as $w$. Since $m \neq n_0$, $w'$ is
not a child of $m$ in $t'$, and hence can serve as a witness.

The cases for the sibling axes are also straightforward, since no nodes in the
domain of $f$ have their siblings modified by the collapse mapping.

We now explain the variation of the argument for the exponential bound on 
branching. Note that NavXP queries can already force exponential branching, 
and thus the result does not follow directly via translation to modal tree logics. 
In a nutshell, we use the same approach, but shrinking horizontal rather than 
vertical paths. 

Construction: Consider the equivalence relation that relates two nodes if 
they have: 

— the same $\varphi$-types that occur as left-siblings, and the same $\varphi$-types that occur
as right-siblings

— the same $\varphi$-types of nodes that are descendants of right-siblings, and similarly
for left-siblings

— the same $\varphi$-types, and the same $\varphi$-types immediately to the right and
immediately to the left

Recall that the right-sibling relation is the transitive closure of the immediate 
right-sibling relation, and similarly for left-sibling. Note that the first two items 
change only exponentially many times, and on an interval where they are both 
constant, the third item takes on only exponentially many values. 

We now claim that any sufficiently long horizontal path can be pruned. Fix
a horizontal path $p$ containing all children of some node. If $p$ is sufficiently long,
there is some equivalence class $C$ that has more than one node in it. Let $n'$ be
the left-most (lowest in sibling order) element of $C$, and $n$ the element of $C$ that
is closest to it on the right. Let $t'$ be obtained by removing all subtrees of nodes
between $n'$ and $n$, including the subtree of $n$ but not the subtree of $n'$.
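
The pruning step on a single sibling list can be sketched as follows. This is a simplified illustration: the equivalence class of a sibling is supplied by a hypothetical function `cls`, whereas in the construction above it is determined by the three conditions on $\varphi$-types and must be read off the current tree.

```python
def prune_siblings_once(siblings, cls):
    """Apply one pruning step to a list of siblings (left to right).

    Finds the left-most sibling n' whose class recurs further right,
    takes n to be the closest such recurrence, and removes everything
    strictly after n' up to and including n. Returns a new list.
    """
    for i, left in enumerate(siblings):
        for j in range(i + 1, len(siblings)):
            if cls(siblings[j]) == cls(left):
                return siblings[: i + 1] + siblings[j + 1 :]
    return list(siblings)  # no class occurs twice: nothing to prune

def prune_siblings(siblings, cls):
    """Iterate the pruning step until no class occurs twice."""
    current = list(siblings)
    while True:
        pruned = prune_siblings_once(current, cls)
        if len(pruned) == len(current):
            return current
        current = pruned
```

After iteration, each class occurs at most once among the siblings, so the number of children is bounded by the number of classes.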

Correctness: Let $f$ be the function taking a node in $t$ that was not removed
by the operation above (for short, a "non-removed node") to its image in $t'$. As usual,
we proceed by showing that $\varphi$-types are preserved in moving from a node $m$ to
$f(m)$. As before, the only important case is the inductive step for $\rho(x) = \exists y\, \beta(x, y)$,
with the non-trivial direction being to show that if $\rho$ holds at $t, m$ then it holds
at $t', f(m)$. Suppose $m$ satisfies $\rho$, with witness $w$. The interesting case is when
$w$ is a removed node, which means it must either be a right-sibling of $n'$ that was
removed or lie below a right-sibling of $n'$ that was removed. We do a case analysis on
the relationship of $w$ to $m$.

Case of Incomparable Witnesses: If w is incomparable to m by both the 
sibling and ancestor relations, then we consider several subcases. 

The first subcase is where $m$ is "below a node in $p$", that is, a descendant of
some node on $p$. Let $n''$ be the node of $p$ that is an ancestor of $m$.

We further consider the subsubcase where the sibling $n''$ is to the right of $n$.
If $w$ is a right-sibling of $n'$, then it was a left-sibling of $n$ or is equal to $n$, since
these are the siblings that are removed. In the first case, it must be that $n'$ has
a left-sibling $w'$ with the same $\varphi$-type as $w$. Since $m$ is "down and to the right"
(that is, below a right-sibling) of $n'$, $w'$ is incomparable to $m$, and thus such a $w'$
can be used as a witness that $t', f(m) \models \rho$. Similarly, in the case that $w$ was
equal to $n$, $n'$ can be used as a witness. If $w$ is below a right-sibling of $n'$, it must
be that $n'$ has a left-sibling that has a descendant with the same $\varphi$-type, and
this can be used as a witness.

The paragraph above completes the subsubcase where $n''$ is to the right of $n$.
If $n''$ is to the left of, or is equal to, $n'$, we argue symmetrically, but considering
the $\varphi$-types that are right-siblings or descendants of right-siblings of $n$.

The subcase where $m$ is itself a sibling of $n$ is similar to the above, except that $w$
cannot be a sibling of $m$, and hence one subcase does not need to be considered.

The final subcase is where $m$ is not on $p$ and is not a descendant of a node in
$p$. Recall the assumption that $w$ is incomparable to $m$ and removed during the
collapse process, and hence $w$ lies below a node on the horizontal path $p$. This
implies that $m$ cannot be an ancestor of the nodes in $p$. If $w$ is a sibling of a node
in $p$ that was removed, we can use any non-removed sibling of $n'$ with the same
$\varphi$-type as a witness (there are at least two such nodes, to the left and right).
Similarly if $w$ is below a sibling of a removed node of $p$, we use any non-removed
node that has the $\varphi$-type of $w$ and which is a descendant of a node on $p$.

Other cases: The case where the witness $w$ is a descendant of $m$ is similar
to the last subcase above. In this case, $m$ must be an ancestor of the nodes on
$p$. Again, if $w$ is a sibling of $n$, we can choose a sibling with the same $\varphi$-type. If
$w$ is a descendant of a sibling, we can choose a descendant of a sibling with the
same $\varphi$-type.

We now turn to the case where $w$ is an immediate left-sibling of $m$. In this
case we must have $m = n$, and we can use the fact that $n$ and $n'$ have the
same $\varphi$-type for their immediate left-siblings. The case where $w$ is an immediate
right-sibling of $m$ is analogous.

The case where w is a following-sibling but not the next-sibling, or a preceding- 
sibling but not the previous-sibling, is handled similarly to above. 

Iterating this pruning process gives the required branching bound. 

Proof of Theorem 2

Recall the statement:

The satisfiability of a NavXP filter $\varphi$ over trees of bounded depth $b$ is in
PSPACE (in $b$ and $|\varphi|$).

It can be awkward to work with NavXP, since one has to switch back and forth between
two- and one-variable formulae. For simplicity, we work with a temporal logic
UTL$_{tree}$ for trees analogous to Unary Temporal Logic on words, introduced in
[LS08]. Formulas $\varphi$ are given by:

$\varphi ::= P_i \mid \varphi \wedge \varphi \mid \neg\varphi \mid \Diamond_{*}\varphi \mid \Diamond^{-}_{*}\varphi \mid \bigcirc_{*}\varphi \mid \bigcirc^{-}_{*}\varphi$

where $*$ stands for either a child (CH) relation or a next-sibling relation (NS).
Informally $\Diamond_{CH}\varphi$ is "eventually along a vertical path $\varphi$ holds", $\Diamond^{-}_{CH}$ is "up
the vertical path to the root", $\bigcirc_{CH}$ is "in some child" and $\bigcirc^{-}_{CH}$ "in the parent".
The variants for NS are defined similarly for horizontal paths. The semantics
of UTL$_{tree}$ with respect to a tree $T$ and node $s$ is given as a variant
of the standard semantics for linear temporal logic on words. For example,
$(T, s) \models P_i \Leftrightarrow s$ has label $P_i$. The boolean operators have their usual recursive
definition. $(T, s) \models \bigcirc_{CH}\varphi \Leftrightarrow \exists s'$ such that $s'$ ChildOf $s$ and $(T, s') \models \varphi$,
and similarly for the other next state modalities.

The above semantics maps a formula to a set of nodes in a tree. For a tree $t$,
we say $t \models \varphi$ to mean $(t, n_0) \models \varphi$ where $n_0$ is the root.

[LS08] shows that NavXP can be translated in polynomial time into UTL$_{tree}$.

We give a non-deterministic PSPACE algorithm that constructs a witness tree
for $\varphi$, materializing only the rightmost branch of the tree. As an abstraction of
this branch the algorithm guesses all the $\varphi$-types of nodes appearing on the path
to the root, along with auxiliary information about whether a node is the last
child of its parent, and which subformulas of the form $\bigcirc_{CH}\psi$ and $\Diamond_{CH}\psi$
(the "some child" and "some descendant" modalities) have been satisfied.

We require all guessed types to be internally consistent, and to satisfy certain
consistency properties. Additionally, we require $\varphi$ to be in the type of the root.

Now we show how to check the consistency for all temporal subformulas. 

1. Subformulas $\bigcirc^{-}_{CH}\psi$ and $\Diamond^{-}_{CH}\psi$ (the parent and ancestor modalities) are the
easiest to check, because for each node we have already guessed all its ancestors.

2. When we extend a path downward (corresponding to guessing the type of
the initial child), we require that all subformulas $\bigcirc^{-}_{NS}\psi$ (previous sibling) are
false and that the truth value of $\Diamond^{-}_{NS}\psi$ (some preceding sibling) is equivalent to
the truth value of $\psi$. When we move from a leaf $l$ of a path to its sibling, we
enforce that the new type contains $\Diamond^{-}_{NS}\psi$ if $l$ contains it, and that it contains
$\bigcirc^{-}_{NS}\psi$ iff $l$ contains $\psi$.

3. When we move to a sibling of $l$, if $l$ contains $\bigcirc_{NS}\psi$ (next sibling), we ensure that
the type of the newly-created sibling contains $\psi$. For $\Diamond_{NS}\psi$ (some following
sibling) we guess that its sibling contains $\psi$ or $\Diamond_{NS}\psi$. If we guess that a leaf is
the rightmost sibling, we check that its type does not contain $\bigcirc_{NS}\psi$.

4. For subformulas $\bigcirc_{CH}\psi$ and $\Diamond_{CH}\psi$ we mark whether they have already been
satisfied by some prior descendant. If not, we decide when we extend the
path whether or not they will be satisfied on the new child, and guess the
type accordingly. When we move from a leaf $l$ to its sibling, we require that
every such formula that was in $l$ has been marked as satisfied.

Proof of Theorem 3

Recall the statement:

The satisfiability problem for FO$^2$ is EXPSPACE-hard, and the same holds for
FO$^2$[$V_{ancOf}$].

We first give the argument for FO$^2$. We reduce from the problem of determining
whether an alternating EXPTIME Turing Machine $T$ accepts a given input $I$.
Without loss of generality we assume that each configuration of $T$ has exactly
two successors. We can also assume that for an input of size $n$, the computation
of $T$ takes at most $2^n$ steps and therefore uses at most $2^n$ tape cells. We give a
polynomial time transformation that takes $T$ and machine input $I$, returning an
FO$^2$ formula $\varphi$ which is satisfiable if and only if $T$ accepts $I$.

We encode each tape configuration as a sequence of $2^n$ nodes with one node
per cell. Each cell will have a label encoding:

— the tape symbol written on the cell

— the time step (or "index") of the configuration, encoded in $n$ bits $c_1, c_2, \ldots, c_n$

— the cell position encoded in $n$ bits $p_1, p_2, \ldots, p_n$

— the control state of the Turing Machine

— the last alternation choice, which is either $\wedge$ or $\vee$

— whether the head of the Turing Machine is present

The computation of $T$ will be described by a tree of tape configurations
starting with an initial configuration. Intuitively the formula $\varphi$ will force the
shape of the tree to match that of the computation tree for $T$. In more detail,
a $\wedge$-configuration will be represented in the tree by a path of $2^n$ nodes that
terminates in a node with two children, each of which is the root of a successor
configuration. On the other hand a $\vee$-configuration is represented by a path
of $2^n$ nodes that terminates in a node with a single child, which is the root of a
single successor configuration. The vocabulary of the formula will have predicates
for the presence or absence of the Turing machine head, the alternation choice,
the tape alphabet symbols, and predicates indicating which of $c_1, \ldots, c_n, p_1, \ldots, p_n$
hold.

Now we discuss in more detail the parts of $\varphi$ that will ensure the structure
described above. The tree should have as root a node whose index is a vector
of zeros for the values of $c_1, \ldots, c_n, p_1, \ldots, p_n$, after which we need to increase the
number represented by this vector by one for each child node. Within the same
configuration the latter can be easily enforced by the following formula:

$$\forall x\, \forall y\; (y \text{ ChildOf } x) \rightarrow \bigvee_{i} \Big( \neg p_i(x) \wedge p_i(y) \wedge \bigwedge_{j<i} \big(p_j(x) \Leftrightarrow p_j(y)\big) \wedge \bigwedge_{j>i} \big(p_j(x) \wedge \neg p_j(y)\big) \Big)$$
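
As a sanity check, the disjunction in the increment formula above can be evaluated by brute force on small bit vectors. The sketch below assumes that the first bit is the most significant one; the function names are ours.

```python
from itertools import product

def is_successor(x_bits, y_bits):
    """Evaluate the disjunction over i from the increment formula:
    bit i flips from 0 to 1, bits before i agree, and every bit after i
    is 1 in x and 0 in y."""
    n = len(x_bits)
    return any(
        (not x_bits[i]) and y_bits[i]
        and all(x_bits[j] == y_bits[j] for j in range(i))
        and all(x_bits[j] and not y_bits[j] for j in range(i + 1, n))
        for i in range(n)
    )

def value(bits):
    """Number coded by a bit vector, most significant bit first."""
    v = 0
    for b in bits:
        v = 2 * v + int(b)
    return v

def check(n):
    """True iff the formula holds exactly for the pairs (x, x + 1)."""
    return all(
        is_successor(x, y) == (value(y) == value(x) + 1)
        for x in product([False, True], repeat=n)
        for y in product([False, True], repeat=n)
    )
```

The check confirms that the formula has size polynomial in $n$ yet pins down the successor relation on all $2^n$ counter values.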

We can use the predicates $c_i$ and $p_i$ (and formulas similar to the one above) to
determine whether two nodes $x$ and $y$ corresponding to tape cells in a configuration
of $T$ represent the same, previous or next position within the same configuration,
or whether they are in the same, previous, or next configuration. For example,
two nodes that represent successive configurations in a single thread of a machine
will need to be in the DescOf relation, and will have configuration co-ordinates
that are in a successor relation, which will be enforced as above, but using the $c_i$
rather than the $p_i$.

To encode the alternation, we need to enforce that the shape of a node is
consistent with the type of the current configuration, in terms of whether the
state is universal or existential. For example, if we have a universal state $q$ and
a transition to control states $q_1$ and $q_2$, after the last cell of the configuration
we will enforce that there is a child whose control state is $q_1$ and another child
whose control state is $q_2$.

We have a formula $\psi(x,y)$ that checks the consistency of the tape cells
represented by nodes $x$ and $y$ that are in a descendant relationship (and hence
represent the same thread in the alternating computation). If $x$ and $y$ point to
the same cell position in consecutive configurations then we need the content of
$x$, $y$ and their adjacent cells to be consistent with the transition function of $T$,
the position of the head, the current state, the cell symbols and the alternation
type ($\wedge$ vs $\vee$).

The requirements that the input is on the tape initially, and that an accepting
state is reached at each leaf, can similarly be easily enforced.

Extension of the argument from FO 2 to F0 2 [V por o/]. In the proof above 
we use only the DescOf and ChildOf relations. We now show how to avoid ChildOf. 
The key is that we do not need consecutive positions within the same configuration
to occur in a parent-child relationship. Along any thread, we can uniquely identify
each node via the predicates $c_1 \ldots c_n$ and $p_1 \ldots p_n$. We can thus identify nodes
corresponding to consecutive positions in the same configuration using these predicates,
while using DescOf to restrict to nodes within the same thread.

We will enforce that 

— each descendant of any node has a larger configuration index 

— each node (except the first) has an ancestor whose configuration address is 
smaller by one 

— each node is either a representative of the last configuration in its thread (i.e. 
with maximal configuration index) or it has a descendant whose configuration 
index is higher by one 

We have similar requirements for the position indices for the same configuration. 

Proof of Theorem 4

Recall the key statement: 

Every satisfiable DownSF-NavXP sentence has a model of polynomial depth. 

Again, since it is more convenient to deal with one-variable formulas than a
mix of two- and one-variable formulas as in NavXP, we will prove this for the modal tree
logic formed from UTL$_{tree}$ by removing the child and parent modalities (but
including the next- and previous-sibling modalities). Call the resulting language
TL$_{tree}$.

Consider a satisfiable TL$_{tree}$ formula $\varphi$, a tree $t$ satisfying $\varphi$, and a path $p$ in
$t$. We will shrink $p$ to polynomial size without impacting $\varphi$, and iterating this
process we can achieve polynomial depth. Once we achieve polynomial depth, we
can use Theorem 2 to get a PSPACE bound.

The vertical $\varphi$-type of $n$ is defined as the collection of subformulas of $\varphi$ of the
form $\Diamond_{CH}\psi$ or $\Diamond^{-}_{CH}\psi$ that hold at $n$, along with the formula $a(x)$ where $a$ is the
label of $n$.

The following lemma generalizes an obvious fact about the usual stutter-free 
temporal logic on words: 

Lemma 6. There are polynomially many (in $|\varphi|$) vertical $\varphi$-type changes along
any path $p$.

Proof. Consider a path $p$ of $T$ and a node $n$ of $p$. If $n \not\models \Diamond_{CH}\psi$ then for all
subsequent nodes $n'$ in the path, $n' \not\models \Diamond_{CH}\psi$. Similarly if $n \not\models \Diamond^{-}_{CH}\psi$, then for
all previous nodes $n'$ in the path, $n' \not\models \Diamond^{-}_{CH}\psi$. We therefore have that these
subformulas change their truth assignment at most once in $p$.
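
The monotonicity used in this proof can be observed on a small example. The sketch below is an illustration under our own assumptions: trees are nested tuples `(label, children)`, the subformula is "some strict descendant is labelled `a`", and the helper names are hypothetical.

```python
def has_label_below(tree, a):
    """Truth of "some strict descendant is labelled a" at the root of tree."""
    return any(child[0] == a or has_label_below(child, a) for child in tree[1])

def flips_along_path(tree, choices, a):
    """Follow the child indices in `choices` downward from the root and
    count how often the truth value of the descendant formula changes."""
    values = [has_label_below(tree, a)]
    node = tree
    for k in choices:
        node = node[1][k]
        values.append(has_label_below(node, a))
    return sum(1 for u, v in zip(values, values[1:]) if u != v)
```

Once the formula fails at a node it fails at every node further down, so the count is never more than 1 on any downward path; summing over the subformulas of this shape gives the polynomial bound.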



We are now ready to prove the polynomial depth bound. Consider any
(downward) path $p$ in the tree. By Lemma 6 there are polynomially many
vertical type changes along a path.

Consider a maximal interval of $p$ all of whose nodes have the same vertical
type, and let $n_{High}$ and $n_{Low}$ be the first (highest) and last (lowest) nodes of the
interval. Now consider the tree $t' = t(n_{Low} \rightarrow n_{High})$ constructed by overwriting
$n_{High}$ with $n_{Low}$.

Let $f$ be the partial function taking nodes in $t$ that are not removed to their
images in $t'$.

As with all of our collapse operations, our goal is to show:

Claim. For any subformula $\rho$ of $\varphi$ and node $m$ in the domain of $f$, we have that
$t, m \models \rho \Leftrightarrow t', f(m) \models \rho$.

Thus performing this operation on every interval shrinks $p$ without impacting
$\varphi$, and iterating over all $p$ gives the depth bound. We prove this by induction on
$\rho$. Atomic propositions and boolean combinations are immediate.

We begin by considering $\rho = \Diamond_{CH}\psi$. If $t, m \models \rho$ then there is a node $w$ below
$m$ satisfying $\psi$ in $t$. If $w$ is in the domain of $f$, we are done by induction, so
assume $w$ is a descendant of $m$ that is not in the domain of $f$. Thus $w$ is also
a descendant of $n_{High}$. Since $n_{High}$ has the same downward-type as $n_{Low}$, $n_{Low}$
has a descendant satisfying $\psi$, and this can be used as a witness. In the other
direction, assume $t', f(m) \models \rho$. There must therefore be a path of nodes in $t'$
starting with $f(m)$ leading to a node $w'$ where $\psi$ holds, and $w'$ must be of the form
$f(w)$ for $w$ in $t$. By induction $w$ can be used as a witness that $t, m \models \rho$. A similar
argument holds for $\rho = \Diamond^{-}_{CH}\psi$.

Note that the sibling nodes of a given node $m$ in the domain of $f$ are not
impacted by the overwrite operation. Using this it is easy to see that the induction
cases for the sibling axes (e.g. $\rho = \bigcirc_{NS}\psi$) go through.

This completes the proof of the claim. Iterating the claim gives the proof of
the first part of the theorem.

Proof of Theorem 5

Recall the statement:

The satisfiability of FO$^2$[$V_{ancOf}$] with the unary alphabet restriction is
NEXPTIME-hard.

Proof. We make use of a standard NEXPTIME-complete problem, tiling an
exponential sized grid [Boa97].

The input consists of a number $n$ (in unary), a set $C = \{1, \ldots, k\}$ of colours,
and vertical and horizontal constraints $V, H \subseteq C \times C$. A tiling is a mapping
$f : \{1, 2, \ldots, 2^n\} \times \{1, 2, \ldots, 2^n\} \rightarrow C$, and a solution to the tiling problem consists
of a tiling such that the vertical and horizontal constraints are satisfied.
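
For concreteness, a checker for solutions of the tiling problem might look as follows. This is a minimal sketch: the dictionary representation of the tiling and the name `is_valid_tiling` are our own choices, and we assume that $H$ constrains pairs of cells that are adjacent in the $x$-direction and $V$ pairs that are adjacent in the $y$-direction.

```python
def is_valid_tiling(tiling, side, H, V):
    """Check a tiling of the side x side grid against the constraints.

    tiling maps (x, y) with 1 <= x, y <= side to a colour;
    H and V are sets of allowed colour pairs for horizontally and
    vertically adjacent cells, respectively.
    """
    for x in range(1, side + 1):
        for y in range(1, side + 1):
            if x < side and (tiling[(x, y)], tiling[(x + 1, y)]) not in H:
                return False
            if y < side and (tiling[(x, y)], tiling[(x, y + 1)]) not in V:
                return False
    return True
```

The checker runs in time polynomial in the grid, but the grid itself has side $2^n$; guessing it is what makes the problem NEXPTIME-complete.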

Our formula will have in its signature predicates

$ZeroX_1, OneX_1, \ldots, ZeroX_n, OneX_n, ZeroY_1, OneY_1, \ldots, ZeroY_n, OneY_n$

representing bits in the binary representation of the $x$- and $y$-coordinates of a grid
position, along with predicates $C_1 \ldots C_k$ for the colours, and finally a predicate $r$
for the root. We code a tiling $f$ by a tree consisting of branches of depth $2n + 2$
for each grid position in $\{1, 2, \ldots, 2^n\} \times \{1, 2, \ldots, 2^n\}$. If $f(x, y) = c$ then the branch
will consist of a root, followed by $n$ nodes, where the $i$th is labelled with $ZeroX_i$
if the $i$th bit of $x$ is 0 and is labelled with $OneX_i$ otherwise. The branch will then
have $n$ nodes coding the $y$-coordinate, labelled with $ZeroY_i$ or $OneY_i$, and finally
a leaf labelled with $c$. Our FO$^2$[$V_{ancOf}$] formula $\varphi$ will describe the encoding of
a valid $T$-tiling $f$. It will include conjuncts enforcing the shape above:

— There is a node with no ancestors labelled $r$, and this node has a descendant
labelled with $ZeroX_1$ and another descendant labelled $OneX_1$.

— Any node with label $ZeroX_i$ or $OneX_i$ for $i < n$ has a descendant labelled
with $ZeroX_{i+1}$ and another with $OneX_{i+1}$; such a node has no descendants
labelled with $ZeroX_j$, $OneX_j$ for $j < i$.

— Any node with label $ZeroX_n$ or $OneX_n$ has a descendant labelled with $ZeroY_1$
and another with $OneY_1$, and has no descendants labelled with $ZeroX_j$, $OneX_j$
for $j < n$.

— Any node with label $ZeroY_i$ or $OneY_i$ for $i < n$ has a descendant labelled with
$ZeroY_{i+1}$ and another with $OneY_{i+1}$, and all its descendants are labelled
with $ZeroY_j$, $OneY_j$ for $j > i$ or with $c \in C$.

— For any node with label $ZeroY_n$ or $OneY_n$, there is some $c \in C$ such that the
node has a descendant labelled $c$ and no descendants with labels other than $c$.

— Nodes labelled with $c \in C$ are leaves.
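
The shape of a single branch can be sketched as follows. The code is illustrative only: it assumes that a coordinate $v \in \{1, \ldots, 2^n\}$ is coded by the $n$-bit expansion of $v - 1$ with the most significant bit first, and the label strings and function names are hypothetical.

```python
def branch_labels(x, y, colour, n):
    """Labels along the branch coding grid position (x, y) with the
    given colour, listed from the root down to the leaf."""
    def bits(v):
        # n-bit expansion of v - 1, most significant bit first (assumed)
        return [((v - 1) >> (n - i)) & 1 for i in range(1, n + 1)]

    labels = ["r"]
    labels += [("OneX" if b else "ZeroX") + str(i)
               for i, b in enumerate(bits(x), start=1)]
    labels += [("OneY" if b else "ZeroY") + str(i)
               for i, b in enumerate(bits(y), start=1)]
    labels.append("c" + str(colour))
    return labels

def same_x(branch1, branch2):
    """Two branches agree on the x-coordinate iff they carry the same
    ZeroX/OneX labels; the formula SAME-X tests this through ancestors."""
    def xs(branch):
        return [l for l in branch if l.startswith(("ZeroX", "OneX"))]
    return xs(branch1) == xs(branch2)
```

Each node carries exactly one label, so the encoding respects the unary alphabet restriction, and each branch has $2n + 2$ nodes.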

One can then write a formula SAME-X($x$, $y$) that checks whether two leaf
nodes have the same $x$-coordinate:

$$\text{SAME-X}(x,y) = \bigwedge_{i} \Big( \big(\exists y\; y \text{ AncOf } x \wedge ZeroX_i(y)\big) \Leftrightarrow \big(\exists x\; x \text{ AncOf } y \wedge ZeroX_i(x)\big) \Big)$$

In the same way we can define SAME-Y($x$, $y$) to check whether two nodes
agree on their $y$-coordinate, and PLUS-X($x$, $y$), PLUS-Y($x$, $y$) to check whether
two nodes represent consecutive $x$- and $y$-coordinates, respectively.

The formulas above still allow the possibility of many branches with the
same co-ordinates but different colours, but this can be ruled out by the following
formula, where LEAF($x$) states that $x$ is a leaf:

$$\forall x\, \forall y\; \big(\text{LEAF}(x) \wedge \text{LEAF}(y) \wedge \text{SAME-X}(x, y) \wedge c(x)\big) \rightarrow c(y)$$

The vertical and horizontal constraints can be enforced in the usual way given
the formulas described above. For example:

$$\forall x\, \forall y\; \big(\text{LEAF}(x) \wedge \text{LEAF}(y) \wedge \text{SAME-X}(x,y) \wedge \text{PLUS-Y}(x,y) \wedge c(x)\big) \rightarrow \bigvee_{(c,c') \in V} c'(y)$$

Conjoining these sentences gives an FO$^2$[$V_{ancOf}$] sentence that is satisfiable over UAR
trees iff a tiling exists.



Proof of the polynomial alternation bound (Lemma 3)

Recall the statement of Lemma 3:

Consider an FO$^2$[$V_{ancOf}$] formula $\psi$ over unary predicates in $\Sigma$, and a tree $t$
satisfying the UAR. For any symbol $a \in \Sigma$, and any root-to-leaf path $p =
p_1 \ldots p_{\max(p)}$ in $t$, the set $p(\psi, a) := \{ i \mid t, p_i \models \psi \wedge a(x) \}$ is made up of at
most $|\psi|^2$ $a$-intervals (i.e., intervals in the set $\{ i \mid t, p_i \models a(x) \}$).

The result relies on the following combinatorial lemma, which is adapted from
the argument in Lemma 2.1.10 of Weis [Wei11]. Analogously to the terminology
above, given a word $w = w_1 \ldots w_{\max(w)}$ and a symbol $a$, by an $a$-interval we
mean an interval in the set of positions in $w$ that have label $a$.

Lemma 7. Consider a word $w$, a symbol $a$, formulas $\varphi_i(x) : i \leq r$, and functions
$L, U$ that assign to each boolean valuation of the $\varphi_i(x)$ a position of $w$. Let $\beta$
be a positive boolean combination in propositions $P_1 \ldots P_r$ and consider the set

$$J(w) := \{ j \in w \mid w(j) = a \wedge (w, j) \models \beta(\varphi_1, \ldots, \varphi_r) \wedge (j > L(Val(j)) \vee j < U(Val(j))) \}$$

where $Val(j)$ is the boolean valuation of $\varphi_i : i \leq r$ induced by $j$ in $w$. Suppose
that for each $i \leq r$ the set of positions of $w$ labelled with $a$ satisfying $\varphi_i$ consists of
at most $|\varphi_i|^2$ $a$-intervals. Then the number of endpoints of $a$-intervals comprising
$J(w)$ is at most $4 + 2(\Sigma_i |\varphi_i|)^2$.

We first show how Lemma 3 follows from Lemma 7. We proceed by induction.
The base step follows using the UAR, since for the predicate $b(x)$ the set $p(b, a)$
is either empty or a single $a$-interval. The cases for the boolean operations are
routine.

In the induction step for existential quantification, we consider a formula
$\psi(x) = \exists y\, \delta(x,y)$, where $\delta(x,y)$ is:

$$\beta(x \text{ DescOf } y,\; x = y,\; x \text{ AncOf } y,\; x \text{ InComp } y,\; \varphi_1, \ldots, \varphi_r,\; \rho_1, \ldots, \rho_s)$$

We can assume $\beta$ is normalized to be a disjunction of formulas $\beta_{DescOf}$,
$\beta_{AncOf}$, $\beta_{InComp}$, $\beta_{=}$, where $\beta_{DescOf}(x, y)$ implies $y$ DescOf $x$, and similarly for the
others. Thus in turn $\psi$ is the disjunction of $\psi_{DescOf}$, $\psi_{AncOf}$, $\psi_{InComp}$, $\psi_{=}$, where $\psi_R$
existentially quantifies over $\beta_R$.

For a boolean valuation $\sigma$ of the $\varphi_i$'s, and for a relation $R$ in DescOf, AncOf,
InComp, $=$, we let $\delta(\sigma, R)(y)$ be the formula obtained from $\delta(x,y)$ by replacing
all $\varphi_i(x)$ in $\delta$ by true or false according to $\sigma$, formula $R(x,y)$ by true, and all
other binary formulas by false.

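Fixing a root-to-leaf path $p = p_1 \ldots p_{\max(p)}$ in tree $t$ (that is, where $p_1$ is the
root and $p_{\max(p)}$ a leaf), and a boolean valuation $\sigma$ of the $\varphi_i$'s, let:

— $L_{InComp}(\sigma)$ represent the smallest $i$ such that

$$\exists n \in t \cdot n \text{ InComp } p_i \wedge t, n \models \delta(\sigma, InComp)(y)$$

— $U_{DescOf}(\sigma)$ represent the largest $i$ such that

$$\exists n \in t \cdot n \text{ DescOf } p_i \wedge t, n \models \delta(\sigma, DescOf)(y)$$

— $L_{AncOf}(\sigma)$ represent the smallest $i$ such that

$$\exists n \in t \cdot n \text{ AncOf } p_i \wedge t, n \models \delta(\sigma, AncOf)(y)$$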

Unwinding the definitions, we can check that a node $p_j$ in the path $p$ within
$t$ satisfies $\psi$ exactly when, letting $\sigma(j)$ be the boolean valuation of the $\varphi_i$'s such
that $t, p_j \models \varphi_i(x)$, we have either:

— $j \leq U_{DescOf}(\sigma(j))$ (thus $p_j$ has a witness to $\delta(\sigma(j), DescOf)$, and hence
a witness to $\psi$ which is a descendant).

— $j > L_{InComp}(\sigma(j))$ (thus $p_j$ has a witness to $\psi$ that is incomparable to it).

— $j > L_{AncOf}(\sigma(j))$ ($p_j$ has a witness to $\psi$ which is an ancestor).

— $t, p_j \models \psi_{=}(x)$, where $\psi_{=}$ is defined above.

Restricting attention to $\psi_{DescOf} \vee \psi_{AncOf} \vee \psi_{InComp}$, we can apply Lemma 7
above, letting $L(\sigma)$ be the max of $L_{InComp}(\sigma)$ and $L_{AncOf}(\sigma)$ and $U(\sigma)$ be
$U_{DescOf}(\sigma) + 1$.

We thus get that the number of boundary points of $a$-intervals comprising
$p(\psi_{DescOf} \vee \psi_{AncOf} \vee \psi_{InComp}, a)$ is at most $4 + 2(\Sigma_i |\varphi_i|)^2$.

The boundary points of $p(\psi_{=}, a)$ are those of the $p(\rho_i, a)$, and applying the
induction hypothesis to these, we get a bound on the number of endpoints of
intervals comprising $p(\psi, a)$ as

$$2\Big(\sum_i |\varphi_i|\Big)^2 + 2 \sum_i |\rho_i|^2$$

which is bounded by $2 \cdot |\psi|^2$. Thus the number of intervals is bounded by $|\psi|^2$.
This completes the proof of Lemma 3.

We now proceed to the proof of Lemma 7.

We follow the approach of Lemma 2.1.10 of [Wei11] and focus on the
modifications of the two main claims used in the proof of that lemma. For a formula
$\psi(x)$ and letter $a$, let $w(\psi, a) = \{ i \in w : w, i \models \psi(x) \wedge a(x) \}$.

For $u \leq r$, let $F_u$ be the set of left boundaries of $a$-intervals that comprise
$w(\varphi_u, a)$, and let $G_u$ be the set of right interval boundaries, where (by convention)
we take the decomposition into $a$-intervals of $w(\varphi_u, a)$ to be such that the
boundary points are labelled with $a$, the right (upper) boundary is not part of
$w(\varphi_u, a)$ but the left boundary is in $w(\varphi_u, a)$. Let $F$ and $G$ be the total set of
left and right interval boundaries of $S$, and let $H = F \cup G \cup \{1, \|w\| + 1\}$.

Consider each interval $I$ defined by two consecutive elements of $H$. The truth
values of the $\varphi_i$ are constant on such an interval, thus the truth value of $\psi$ on
positions $j$ in this interval is determined by where $j$ is relative to $L(Val(j))$ and
$U(Val(j))$. Let $C$ be $H$ unioned with all points of the form $L(Val(j)) + 1$ or
$U(Val(j))$.

For a right (upper) interval boundary $d$ in $H$, we let $q(d)$ be the point
$L(Val(j)) + 1$ for $j$ in the interval (all such points agree on $Val(j)$) to the left of
$d$, if such a point exists; $q(d)$ is undefined otherwise. For a left (lower) interval
boundary $c$ in $H$, we let $p(c)$ be the point $U(Val(j))$ to the right of $c$ within
the interval, if it exists, and let $p(c)$ be undefined otherwise. We let $P(c) = p(c)$
exactly when $p(c)$ is a right boundary point of $J(w)$, that is, an $a$-labelled
position lying outside of the set, with the $a$-position immediately below it lying
in the set. Let $P(c)$ be undefined otherwise. Similarly let $Q(c) = q(c)$ when $q(c)$
is a left boundary point of $J(w)$.

Let $F_a$ be the union over all $F_v$ with $v \neq u$, and define $G_a$ analogously.

Claim. Given $c$ and $d$ consecutive interval boundaries from $F_a$, there is at most
one $i \in F_u \cap [c, d)$ with $P(i) \neq \emptyset$.

Proof. Suppose there is $i \in F_u \cap [c, d)$ with $P(i) \neq \emptyset$ and consider another
$j \in F_u \cap [c, d)$ with $j < i$. Since the interval $[c, d)$ contains no left interval
boundaries besides the ones from $F_u$, and since $i$ and $j$ are both in $F_u$, and
hence are both in $w(\varphi_u, a)$, we conclude that every $\varphi_k : k < r$ that holds in
the interval starting from $i$ also holds at the interval starting from $j$. Thus
$\mathrm{Val}(j) = \mathrm{Val}(i)$. If $p(j)$ is a right boundary point of $J(w)$, it must be that the
positions immediately below it are in the set $J(w)$, and thus these positions must
satisfy $x < U(\mathrm{Val}(x))$. Once truth values for the $\varphi_k : k < r$ are fixed (and hence
$\mathrm{Val}(x)$ is fixed), the positions satisfying $x < U(\mathrm{Val}(x))$ are closed downwards.
Note that $i < p(i)$, by definition of $p(i)$, and therefore we must have that $i$ and
$j$ both satisfy $x < U(\mathrm{Val}(x))$. Combining with the fact that $i$ and $j$ agree on
$\varphi_k : k < r$, we see that the interval above $j$ agrees on $J(w)$ with the interval
above $i$, and thus $P(j)$ must be empty.

Let $C(i)$ be the set of boundary points contributed by $i$: namely $P(i)$ if it
exists, $Q(i)$ if it exists, and also $i$ if it is a boundary point of $J(w)$.

Claim. Given $c$ and $d$ consecutive interval boundaries from $F_a$, and $i \in F_u \cap [c, d)$
with $i \notin G$, $Q(i) \neq \emptyset$. Then we have $i \notin C(i)$.

Proof. Fix $c, d, i$ as in the claim. Since $i \notin G$, $i$ is not a right interval boundary
of any set $p(\varphi_j, a)$, and therefore the $\varphi_j$ that are true at the interval ending
at $i$ are also true at the interval starting at $i$. Furthermore $Q(i) \neq \emptyset$ implies
that $L(\mathrm{Val}(x)) < x$ holds for $x$ above $Q(i)$, and thus will hold for all a-labelled
positions sharing $\mathrm{Val}(i)$ above $i$. Thus $i$ cannot be a boundary point for $J(w)$,
and therefore $i \notin C(i)$.

The rest of the argument follows that in [Wei11] precisely.
The above two claims imply that for every $i \in F_u \cap [c, d) - G$ except possibly
one element, $C(i)$ is either empty, contains the single element $Q(i)$, or contains
only $i$. At the one exceptional element $C(i)$ could consist of at most two elements,
$P(i)$ and either $Q(i)$ or $i$ (but not both, by the second claim).

Therefore, $\bigcup_{i \in F_u \cap [c,d) - G} C(i)$ has at most $|F_u \cap [c, d)| + 1$ elements. Unioning over
all intervals $[c, d)$ we get

$$\sum_{i \in F_u - G} |C(i)| \le \sum_{c \in F_a} \big(|F_u \cap [c, d)| + 1\big) = |F_u| + |F_a|$$

Using again the fact that each $C(i)$ contains at most two elements (see above),
we also know $\sum_{i \in F_u - G} |C(i)| \le 2 \cdot |F_u|$, and thus:

$$\sum_{i \in F_u - G} |C(i)| \le |F_u| + \min\{|F_u|, |F_a|\}$$



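Since for each $j$, the number of intervals, and hence the number of left endpoints
of intervals, is assumed to be at most $|\varphi_j|^2$, and using that the sum of squares is
less than the square of a sum we get:

$$\begin{aligned}
\sum_{i \in F_u - G} |C(i)| &\le |\varphi_u|^2 + \min\Big\{|\varphi_u|^2, \Big(\sum_{i \neq u} |\varphi_i|\Big)^2\Big\} \\
&\le |\varphi_u|^2 + |\varphi_u| \cdot \min\Big\{|\varphi_u|, \sum_{i \neq u} |\varphi_i|\Big\} \\
&\le |\varphi_u|^2 + |\varphi_u| \cdot \sum_{i \neq u} |\varphi_i| \\
&= |\varphi_u| \cdot \sum_i |\varphi_i|
\end{aligned}$$

By a symmetric argument we get

$$\sum_{i \in G_u - F} |C(i)| \le |\varphi_u| \cdot \sum_i |\varphi_i|$$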

Now the total number of boundary points for $J(w)$ is at most the endpoints
of the path, the highest value of $U$ and the lowest value of $L$, plus the union over
$i$ of $C(i)$. Thus we have that the total number is at most:

$$4 + \sum_u 2 \cdot |\varphi_u| \cdot \sum_i |\varphi_i| \le 4 + 2 \cdot \Big(\sum_i |\varphi_i|\Big)^2$$

This completes the proof of Lemma [7].
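The arithmetic in the chain of inequalities above can be sanity-checked numerically. The sketch below is an illustration only, not part of the proof: `sizes` stands for a hypothetical list of subformula sizes, and the function checks each step of the chain for every choice of the distinguished index, followed by the final summation.

```python
import itertools


def chain_holds(sizes):
    """Check the inequality chain for one list of positive subformula sizes."""
    total = sum(sizes)
    for a in sizes:                      # a plays the role of the size of the u-th subformula
        rest = total - a                 # sum of the sizes of the other subformulas
        step1 = a ** 2 + min(a ** 2, rest ** 2)
        step2 = a ** 2 + a * min(a, rest)
        step3 = a ** 2 + a * rest
        # step1 <= step2 <= step3, and step3 equals a * total
        if not (step1 <= step2 <= step3 == a * total):
            return False
    # summing 2 * a * total over all subformulas gives 2 * total ** 2
    return sum(2 * a * total for a in sizes) == 2 * total ** 2


# exhaustive check over all triples of sizes between 1 and 5
all_ok = all(chain_holds(s) for s in itertools.product(range(1, 6), repeat=3))
```

The check also shows that the last step of the proof holds with equality, since summing $|\varphi_u|$ over $u$ gives the same total again.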

Proof of Theorem [9] 

Recall the statement: 

The satisfiability problem for $\mathrm{FO}^2[V_{parOf}]$ is NEXPTIME-hard, even with the
unary alphabet restriction.

Proof. Clearly, the UAR has no impact, since n predicates on a single node can 
be simulated by considering the labels of the n nearest ancestors. 

We reduce from tiling a $2^n$ by $2^n$ grid with tiles $T_1, \ldots, T_m$ in such a way
as to satisfy a given vertical constraint $V$ and horizontal constraint $H$. We let
$\Sigma_n$ be an alphabet with symbols $ZeroX_1, OneX_1, \ldots, ZeroX_n, OneX_n$, $ZeroY_1,
OneY_1, \ldots, ZeroY_n, OneY_n$, $T_1, \ldots, T_m$. Consider trees in which: nodes at level
$i \le n$ are labelled with $ZeroX_i$ or $OneX_i$, each node of level $i \le n - 1$ has both
a $ZeroX_{i+1}$ and a $OneX_{i+1}$ child. Similarly nodes at level $n + 1 \le i \le 2n$
are labelled with $ZeroY_i$ or $OneY_i$. Each node of level $n$ has both a $ZeroY_1$ and
a $OneY_1$ child, each node of level $n + 1 \le i \le 2n - 1$ has both a $ZeroY_{i+1}$ and
a $OneY_{i+1}$ child.

Finally, each node of level $2n$ has a single child labelled with one of the
$T_j$. Such trees represent a tiling of the grid. It is easy to write an $\mathrm{FO}^2[V_{parOf}]$
formula describing such trees, and also requiring that the horizontal and vertical
constraints are satisfied.
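To make the encoding concrete, the following sketch lists the root-to-leaf label sequences of the tree that represents a given tiling, and checks the two constraints directly on the grid. It rests on several assumptions: the function names are hypothetical, the root is not modelled, the Y labels are indexed from 1 to $n$ as in the alphabet, and $H$ and $V$ are taken to be sets of allowed adjacent pairs.

```python
from itertools import product


def encode_tiling(n, tiling):
    """Return the label sequences along all branches of the encoding tree.

    `tiling` maps (x, y) with 0 <= x, y < 2**n to a tile name. The first n
    labels on a branch spell the bits of x, the next n labels spell the bits
    of y, and the final label is the tile placed at (x, y).
    """
    paths = []
    for bits in product((0, 1), repeat=2 * n):
        xbits, ybits = bits[:n], bits[n:]
        labels = [("OneX" if b else "ZeroX") + str(i + 1) for i, b in enumerate(xbits)]
        labels += [("OneY" if b else "ZeroY") + str(i + 1) for i, b in enumerate(ybits)]
        x = int("".join(map(str, xbits)), 2)
        y = int("".join(map(str, ybits)), 2)
        labels.append(tiling[(x, y)])
        paths.append(tuple(labels))
    return paths


def respects_constraints(n, tiling, H, V):
    """Check horizontal (H) and vertical (V) adjacency constraints on the grid."""
    size = 2 ** n
    horizontal_ok = all((tiling[(x, y)], tiling[(x + 1, y)]) in H
                        for x in range(size - 1) for y in range(size))
    vertical_ok = all((tiling[(x, y)], tiling[(x, y + 1)]) in V
                      for x in range(size) for y in range(size - 1))
    return horizontal_ok and vertical_ok


# a 2 x 2 checkerboard, i.e. n = 1
board = {(0, 0): "T1", (1, 0): "T2", (0, 1): "T2", (1, 1): "T1"}
alternate = {("T1", "T2"), ("T2", "T1")}
paths = encode_tiling(1, board)
```

The tree has $2^{2n}$ branches but depth only $2n + 1$, which is why a formula of size polynomial in $n$ can address every grid cell.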

Completion of the proof of Theorem [8] 

Recall the statement: 

The satisfiability problem for $\mathrm{FO}^2[V_{ancOf}]$ over ranked schemas is in NEXPTIME,
and is thus NEXPTIME-complete.



We first prove the key lemma, Lemma [5]. Recall that in this lemma, we replace
node $n$ by node $n'$, where $n$ and $n'$ are not in the protected witness set $W$ and
share the same $\varphi$-type, the same set of ancestor $\varphi$-types, and the same set of
selected descendant $\varphi$-types. The lemma then claims:

For all $m \in T_1$ the one-variable subformulas of $\varphi$ satisfied by $m$ in $t$ are the
same as those satisfied by $f(m)$ in $t'$. Moreover, for every node $m'$ in $T_2$, the
one-variable subformulas of $\varphi$ satisfied by $m'$ in $t'$ are the same as those satisfied
by $f^{-1}(m')$ in $t$.

We prove both parts of the lemma by simultaneous induction on the structure
of the formula, where the case of atomic propositions and the case of
boolean combinations are trivial. The only interesting case is for subformulas
$\rho = \exists y\, \beta(x, y)$.

We first note the following key property of the witness set $W$: For nodes $m$
of $t$, if there is a $w$ incomparable to $m$ such that $t, m, w \models \beta(x, y)$, then there is
such a $w$ in $W$.

To prove this, fix $m$ and $w$ such that the hypothesis holds. Let $w_\tau$ be the
basic global witness for the $\varphi$-type of $w$. If $w_\tau$ is incomparable to $m$, then $w_\tau$ has
the required property. If $w_\tau$ is a descendant of $m$, then we would have thrown
in the necessary $w$ into $W$ as an incomparable global witness for $m$. If $w_\tau$ is an
ancestor of $m$ or equal to $m$, we would have thrown in the necessary $w$ into $W$
as an incomparable global witness for $w_\tau$.

We begin by comparing formulas $\rho$ between a node $m$ of the old tree (i.e.
$m \in T_1$) and the same node considered in the new tree. We first consider the case
where $\rho$ holds at $m$ in $t$, and show that $\rho$ remains true at its image $f(m)$ in $t'$.

— If the witness of the truth of $\rho$ was $m$ or its ancestor, then these are also
in $T_1$, and thus are preserved under the mapping, so by induction they (i.e.
their image under $f$) can serve as a witness in $t'$.

— Suppose there is a witness $w$ that is neither $m$, nor an ancestor of $m$, nor
a descendant of $m$. By the key property of $W$, there is a witness $w'$ in the
set $W$ that is also incomparable to $m$, and has the same $\varphi$-type as $w$. This
can be used as a witness.

— The last possibility is that some of the witnesses are descendants. If at least
one of these is not in $\mathrm{SubTree}(t, n)$, then it is preserved and can be used as
a witness. Otherwise, the witness must be in $\mathrm{SubTree}(t, n)$. If $n$ itself was
a witness, then since it was replaced by an $n'$ such that $\mathrm{Tp}_\varphi(n) = \mathrm{Tp}_\varphi(n')$
we can use the copy of $n'$ as a witness, by induction. On the other hand,
if there was a descendant of $n$ which was a witness, then there would
have been a witness $w''$ such that $\mathrm{Tp}_\varphi(w'') \in \mathrm{SelectedDescTypes}(n)$. Since
$\mathrm{SelectedDescTypes}(n) = \mathrm{SelectedDescTypes}(n')$ we would be able to find a witness
with the appropriate $\varphi$-type in a copy of the subtree rooted at $n'$.

We now consider the case where $\rho$ holds at a node $m'$ that is the image of
a node $m$ in $\mathrm{SubTree}(t, n')$ under the overwriting operation, and aim to show
that $\rho$ holds at $m$. Note that once this is shown, the other direction of the if
and only if for nodes in $T_1$ follows easily by induction. So fix such $m'$ and $m$.
The only non-trivial case is for $m'$ being a copy of $n'$, with the witness being
its ancestor. Here we can use as a witness one of the ancestors of $n$, because
$\mathrm{AncTypes}(n) = \mathrm{AncTypes}(n')$.

This completes the proof of Lemma [5]. The argument for Theorem [8] for UAR
trees proceeds by repeatedly updating while such nodes are available. The process
terminates, as argued in the body of the paper.

The extension for ranked schemas follows along the same lines, but in order 
to collapse nodes n and n', we require in addition that the tree automaton A 
reaches the same state at n and n'. 

Proof of Theorem [7] 

Recall the statement: 

There are $\mathrm{FO}^2[V_{ancOf}]$ formulas $\varphi_n$ of size $O(n)$ that are satisfiable over UAR
binary trees, where the minimum depth of satisfying binary UAR trees grows as
$2^n$.

Proof. We let $\Sigma_n$ consist of $\{b, s\} \cup \{a_i : i < n\}$.
We consider trees in which:

— the root is labelled b

— nodes labelled b are always comparable via descendant

— nodes labelled s are never comparable via descendant

— every ancestor of a b-labelled node is labelled b

— every ancestor of an s-labelled node is labelled b

— descendants of s-labelled nodes can be labelled with any of the $a_i$ (but not
with b)

These conditions can easily be enforced by an $\mathrm{FO}^2[V_{ancOf}]$ formula.

In such trees the b-labelled nodes must go down a single branch, with s-labelled
nodes splitting off on a separate branch. See Figure [2]. We now let $\psi_i : i < n$
be the formula that holds at an s-labelled node if it has a descendant labelled $a_i$. Note
that any combination of the $\psi_i$ is consistent, and the set of $\psi_i$ that hold of
an s-labelled node can thus be considered an n-bit address for the s-node. We
can write a formula $\varphi_n$ that asserts that 1. the constraint on the shape of the
tree above holds 2. there is an s-node with address $0^n$ 3. for every s-labelled node
with address $a$ not equal to $1^n$, there is an s-labelled node whose bit address is
the successor of $a$. A binary tree satisfying $\varphi_n$ must have exponential depth. See
Figure [2] for an example.
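The counting behind the depth bound can be sketched as follows. Starting from address $0^n$ and repeatedly taking successors forces one s-node per n-bit address. In a binary tree every b-node other than the lowest one already has a b-child, so it can carry at most one s-child, while the lowest b-node can carry two. The function names below are hypothetical and the code only illustrates this count.

```python
def successor(address):
    """Binary successor of an address given as a tuple of 0/1, most significant bit first."""
    bits = list(address)
    for k in reversed(range(len(bits))):
        if bits[k] == 0:
            bits[k] = 1
            return tuple(bits)
        bits[k] = 0          # carry into the next more significant bit
    raise ValueError("the all-ones address has no successor")


def forced_addresses(n):
    """Addresses that must be realised by s-nodes: 0^n and all iterated successors."""
    addr = (0,) * n
    seen = [addr]
    while addr != (1,) * n:
        addr = successor(addr)
        seen.append(addr)
    return seen


def min_b_chain_length(n):
    """Fewest b-nodes on the single b-branch of a binary tree hosting all forced s-nodes.

    A chain of k b-nodes offers at most (k - 1) + 2 = k + 1 slots for s-children,
    so k must be at least the number of s-nodes minus one.
    """
    s_nodes = len(forced_addresses(n))
    return max(1, s_nodes - 1)
```

For example, `min_b_chain_length(3)` evaluates to 7, matching the $2^n - 1$ lower bound on the length of the b-branch.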



Details for the proof of Theorem [10]

Recall the statement:

The satisfiability problem for $\mathrm{FO}^2[V_{noAncOf}]$, and the satisfiability problem
with respect to a ranked schema, are in NEXPTIME, and hence are NEXPTIME-complete.

We give the details for satisfiability first. By Lemma [T] we know that an
$\mathrm{FO}^2[V_{noAncOf}]$ formula $\varphi$ which is satisfied over trees is satisfied by a tree $t$ of
depth at most exponential in $|\varphi|$. We also can bound the outdegree of nodes by
an exponential.
Fig. 2. An example model of exponential depth for an $\mathrm{FO}^2[V_{ancOf}]$ formula in
the ranked case



For each $\varphi$-type that is satisfied in $t$, choose a satisfier and include it along
with all its ancestors in a set $W$: these are the basic witnesses. Then throw in all
children of basic witnesses.
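The construction of $W$ can be sketched as below, assuming the tree is given by hypothetical `parent` and `children` maps and that `node_type` assigns each node its type. Which satisfier is chosen for a type is arbitrary; the sketch takes the first one in iteration order.

```python
def witness_set(parent, children, node_type):
    """Build the protected set W from a tree.

    parent:    dict node -> parent node (the root maps to None)
    children:  dict node -> list of child nodes
    node_type: dict node -> hashable type of the node
    """
    # choose one satisfier for every type realised in the tree
    chosen = {}
    for node in parent:
        chosen.setdefault(node_type[node], node)

    # basic witnesses: the chosen satisfiers together with all their ancestors
    basic = set()
    for node in chosen.values():
        while node is not None:
            basic.add(node)
            node = parent[node]

    # finally add every child of a basic witness
    W = set(basic)
    for node in basic:
        W.update(children.get(node, []))
    return W


# small example: r has children x and y, x has child z, y has child u
parent = {"r": None, "x": "r", "y": "r", "z": "x", "u": "y"}
children = {"r": ["x", "y"], "x": ["z"], "y": ["u"]}
node_type = {"r": "A", "x": "B", "y": "B", "z": "C", "u": "C"}
W = witness_set(parent, children, node_type)
```

In the example the node `u` stays outside $W$, so it remains a candidate for the collapse operation.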

Thus the size of $W$ is at most exponential. Now we transform $t$ to another
tree $t'$ such that $t' \models \varphi$ and $t'$ has only exponentially many different subtrees.

Recall that our update procedure looks for nodes $n, n'$ in $t$ such that
1. $n, n' \notin W$; 2. $\mathrm{SubTree}(t, n) \prec \mathrm{SubTree}(t, n')$ is not isomorphic to the subtree
rooted at $n'$; 3. $\mathrm{Tp}(n) = \mathrm{Tp}(n')$ and $\mathrm{Tp}(\mathrm{parent}(n)) = \mathrm{Tp}(\mathrm{parent}(n'))$. If such nodes
exist, we let $t' = \mathrm{Update}(t)$ be obtained by choosing such $n$ and $n'$ and applying the
collapse operation that replaces the subtree of $n$ by that of $n'$.

Let $T_1$ be the nodes that were not in $\mathrm{SubTree}(t, n)$, and for any node $m \in T_1$
let $f(m)$ denote the same node viewed in $t'$. Let $T_2$ denote the nodes in $t'$ that
are images of a node in $\mathrm{SubTree}(t, n')$ under the replacement. For each $m \in T_2$,
let $f^{-1}(m)$ denote the node in $t$ from which it derives.

We claim the following: 

Lemma 8. For all $m \in T_1$ the $\varphi$-type of $m$ in $t$ is the same as the $\varphi$-type of
$f(m)$ in $t'$. Moreover, for every node $m'$ in $T_2$, the $\varphi$-type of $m'$ in $t'$ is the same
as the $\varphi$-type of $f^{-1}(m')$ in $t$.

Applying the lemma above to the root of $t$, which is necessarily in $T_1$, it
follows that the truth of the sentence $\varphi$ is preserved by this operation.

Proof. We prove both parts of the lemma by simultaneous induction on the
structure of the formula, where the case of atomic propositions and the case of
boolean combinations are trivial. The only interesting case is for subformulas
$\psi = \exists y\, \beta(x, y)$.

We begin by considering formula $\psi$ at node $m \in T_1$. We first consider the
case where $\psi$ holds at $m$.



— If the witness of the truth of $\psi$ was $m$ or its parent, then these are also in $T_1$,
and thus are preserved under the mapping, so by induction they (i.e. their
image under $f$) can serve as a witness in $t'$.

— Similarly, if the witness was a sibling of $m$, then it can serve as a witness in
$t'$, since the collapse map does not impact the sibling relations.

— If all witnesses are neither a parent nor a child of $m$, then take one such
witness $w$ and an element $w'$ in $W$ that realizes the same $\varphi$-type as $w$. $w'$
must be neither a parent nor a child of $m$ (since if it were a parent, $m$ would
have been a child witness, and hence protected). Thus $w'$ can be used as
a witness.

— The last possibility is that some of the witnesses are children. If at least
one of these is not in $\mathrm{SubTree}(t, n)$, then it is preserved and can be used as
a witness. Otherwise, $n$ itself must be a witness. It was replaced by an $n'$
such that $\mathrm{Tp}_\varphi(n) = \mathrm{Tp}_\varphi(n')$ so the copy of $n'$ can be used as a witness, by
induction.

We now consider the case where $\psi$ holds at a node $m' \in T_2$ that is the image
of a node $m$ of $t$, and aim to show $\psi$ holds at $m$. The only non-trivial case is for
$m'$ being the image of $n'$, with the witness being its parent. Here we can use as
a witness the parent of $n$, because $\mathrm{Tp}_\varphi$ of the parent of $n$ is the same as $\mathrm{Tp}_\varphi$ of
the parent of $n'$.

We now iterate the procedure $t_{i+1} := \mathrm{Update}(t_i)$, until no more updates
are possible. Since $t_{i+1} \prec t_i$, the process must terminate. The resulting tree will
contain only exponentially many different subtrees. We can thus represent it as
a DAG, with one node for each subtree.
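Representing a tree by a DAG with one node per distinct subtree can be sketched with standard hash-consing. This is an illustration rather than the procedure used in the proof: the function names are hypothetical, trees are assumed to be nested `(label, children)` pairs, and two subtrees count as equal when labels and ordered child lists coincide.

```python
def compress_to_dag(tree):
    """Share identical subtrees of `tree`, given as (label, [child trees]).

    Returns (root_id, table), where table maps each DAG node id to
    (label, tuple of child ids). Identical subtrees receive the same id.
    """
    ids = {}      # (label, child ids) -> DAG node id
    table = {}    # DAG node id -> (label, child ids)

    def visit(node):
        label, kids = node
        key = (label, tuple(visit(k) for k in kids))
        if key not in ids:
            ids[key] = len(ids)
            table[ids[key]] = key
        return ids[key]

    root_id = visit(tree)
    return root_id, table


def full_binary(depth, label="a"):
    """Complete binary tree of the given depth with a single label."""
    if depth == 0:
        return (label, [])
    child = full_binary(depth - 1, label)
    return (label, [child, child])


root_id, table = compress_to_dag(full_binary(10))
```

A complete binary tree of depth 10 has 2047 nodes but only 11 distinct subtrees, so `table` has 11 entries. The recursive traversal is only suitable for shallow inputs; trees of exponential depth would need an iterative version.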

Thus we have shown that any satisfiable formula has an exponential-size
DAG that unfolds into a model of the formula. Given such a DAG, we can check
whether an $\mathrm{FO}^2$ formula holds in polynomial time in the size of the DAG. Thus
we have a NEXPTIME algorithm for checking satisfiability.

The modification in the presence of a ranked schema is straightforward:
again we show that there is an exponential-sized DAG. Given a bottom-up
tree automaton $A$, the modification procedure Update only replaces $n$ by $n'$ if, in
addition to the criteria above, their subtrees reach the same state of $A$. Clearly,
the state of $A$ is also preserved by this replacement. This completes the proof of
Theorem [10].



